This project uses PHP, ImageMagick and SQLite to build an image comparison search engine. You pass the PHP functions an image file and the allowed difference per pixel, expressed as a percentage. They return an array of ids for the stored images that fall within that range.
The comparison method resizes each image to a 6×6 thumbnail, converts it to greyscale and computes the mean value of the entire thumbnail. For each pixel, the difference from that mean is computed, divided by 2, and stored in the SQLite database. Halving the difference lets it fit within a single byte (assuming 8-bit greyscale, a difference can range from -255 to 255, so half of it stays within -127 to 127), which improves speed while minimizing database size. The SQLite table schema consists of fields “p0” through “p35”, one per pixel, along with an id, a timestamp, and the mean.
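The fingerprint step described above can be sketched as follows. This is an illustrative Python sketch, not the project's PHP code. It assumes the image has already been resized to 6×6 and converted to 8-bit greyscale (the part ImageMagick handles in the project), so the input is simply a list of 36 values. The function name `fingerprint` and the choice to truncate toward zero after halving are assumptions; the original does not say how the halved value is rounded.

```python
def fingerprint(grey_pixels):
    """Return (mean, deltas) for a 6x6 greyscale thumbnail.

    grey_pixels: 36 integers in 0..255, row by row (assumed input format).
    """
    if len(grey_pixels) != 36:
        raise ValueError("expected 36 pixel values (6x6 thumbnail)")

    # Mean brightness of the whole thumbnail.
    mean = sum(grey_pixels) / len(grey_pixels)

    # Each difference from the mean lies within -255..255; halving it keeps
    # the stored value within -127..127, which fits in one signed byte.
    # int() truncates toward zero (an assumption, see the note above).
    deltas = [int((p - mean) / 2) for p in grey_pixels]
    return mean, deltas


# Example: a thumbnail whose top half is black and bottom half is white.
mean, deltas = fingerprint([0] * 18 + [255] * 18)
```

The 36 values in `deltas` correspond to the fields p0 through p35, and `mean` to the stored mean.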
To find similar images, the target image is first analyzed with the method above. The percentage value is then applied to the resulting values to obtain the minimum and maximum that each pixel may take while still counting as a match. These bounds are used to build an SQL query consisting of 72 boolean conditions (a lower and an upper bound for each of the 36 pixels). SQLite 2 limited you to 100 statements per query, which is why I opted for 6×6 thumbnails instead of the 8×8 that the original algorithm suggested; an 8×8 thumbnail would need 128 conditions. I do not believe this limitation exists in SQLite 3. Indexes are created on all pixel fields.
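A rough sketch of how such a range query could be assembled, written in Python with the standard `sqlite3` module rather than the project's PHP. Several things here are assumptions: the table name `images`, the helper names `build_query` and `find_similar`, and especially the way the percentage becomes a tolerance (here, a percentage of the 127 maximum stored magnitude). The original does not spell out that conversion, so treat it as one plausible reading.

```python
import sqlite3

PIXELS = 36  # 6x6 thumbnail, fields p0..p35


def build_query(deltas, percent, table="images"):
    """Build a parameterized range query: two conditions per pixel field."""
    # Assumed mapping from percentage to an absolute tolerance.
    tolerance = round(percent / 100 * 127)
    clauses, params = [], []
    for i, d in enumerate(deltas):
        clauses.append(f"p{i} >= ?")  # lower bound
        clauses.append(f"p{i} <= ?")  # upper bound
        params.extend([d - tolerance, d + tolerance])
    sql = f"SELECT id FROM {table} WHERE " + " AND ".join(clauses)
    return sql, params


def find_similar(conn, deltas, percent):
    """Return the ids of stored images within the allowed range."""
    sql, params = build_query(deltas, percent)
    return [row[0] for row in conn.execute(sql, params)]


# Small in-memory demonstration (timestamp and mean columns omitted).
conn = sqlite3.connect(":memory:")
columns = ", ".join(f"p{i} INTEGER" for i in range(PIXELS))
conn.execute(f"CREATE TABLE images (id INTEGER PRIMARY KEY, {columns})")
placeholders = ", ".join("?" * (PIXELS + 1))
conn.execute(f"INSERT INTO images VALUES ({placeholders})", [1] + [3] * PIXELS)
conn.execute(f"INSERT INTO images VALUES ({placeholders})", [2] + [40] * PIXELS)

target = [0] * PIXELS
matches = find_similar(conn, target, 5)  # 5% gives a tolerance of 6 here
```

In this demonstration, image 1 differs from the target by 3 on every field and is returned, while image 2 differs by 40 and is excluded. Using bound parameters instead of interpolating the values keeps the 72 conditions uniform and avoids quoting problems.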
Querying is quite fast, especially once indexes are placed on each pixel field. I do not have hard benchmarks yet, but will post them once I have taken more time to evaluate performance.
Limitations:
- Some of the math is done in PHP rather than in an outside library; performance could probably be improved by using a different language (e.g. Java/C++).
- Dependent on specific builds of ImageMagick. Different builds of ImageMagick may result in different values being supplied, throwing off results.
You can now download the code at the imagecompare GitHub repo.