We don't do OCR. We have people who scan the documents if you snail-mail them to us (or you can upload the scans directly to your account, email them in, etc.) and then convert these images into queryable data, with special emphasis on receipts (also invoices, bills, bank withdrawals, etc.) and business cards. It's a service that our customers are happy to pay for, so I'm not quite getting your point about why this model won't work.

Had it been any one of the issues I've pointed out, I would have been quite happy to work around it. I do that routinely with other pieces of technology that do 95% of what I want. However, the hits kept coming. At the same time, I was using Postgres on another project, and the question was begging to be asked - if I am going to run extra services like Lucene and do extra work to achieve full-text search, what am I getting in return? And I am sorry to say - case-insensitive search is quite a basic feature.

As for the 1111 issue - I admit it can be fixed. But the fact still remains - there's no way I would face this problem with Postgres. So why not just switch?

Postgres is a really nicely thought-out and well-executed database. And don't get me wrong - I am not building a system designed to operate at stratospheric scale. That would be premature, in my opinion.

Thanks for taking the trouble to comment.