How to Hook Up lmdb with Caffe using Python!
Long story short, here’s how I figured out how to interact with lmdb using Python.
First, a bit of setup:
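A minimal sketch of the imports this walkthrough relies on, assuming pycaffe, the lmdb Python binding and NumPy are installed. The REQUIRED tuple and the availability check are illustrative additions of mine, there only so the snippet reports what is missing instead of crashing on the first failed import:

```python
import importlib.util

# Third-party packages this walkthrough assumes are installed.
REQUIRED = ("numpy", "lmdb", "caffe")

# Illustrative guard (not part of Caffe or lmdb): collect the packages that
# cannot be found on this machine.
missing = [name for name in REQUIRED if importlib.util.find_spec(name) is None]

if missing:
    print("Install these before continuing:", ", ".join(missing))
else:
    import numpy as np
    import lmdb
    from caffe.proto import caffe_pb2                     # defines the Datum message
    from caffe.io import array_to_datum, datum_to_array   # ndarray <-> Datum helpers
```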
The setup relies on two packages that may be unfamiliar: caffe.proto and caffe.io.
The caffe.proto package defines a lot of things, but here we only use the Datum data structure (available as caffe.proto.caffe_pb2.Datum). You can think of it as an intermediate form between our images and labels and the lmdb entries.
The caffe.io package has two helper functions, datum = array_to_datum(X, y) and X = datum_to_array(datum), which save us some time defining the structure of our Datum object. Note that y is not returned by datum_to_array; you can simply read datum.label to get it.
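To make the round trip concrete, here is a small sketch. It assumes pycaffe is importable and that X is a 3-D array in channels x height x width order; datum_roundtrip is a hypothetical name used only for this illustration.

```python
def datum_roundtrip(X, y):
    """Pack (X, y) into a Datum and unpack it again.

    X is assumed to be a 3-D NumPy array laid out as (channels, height, width);
    y is assumed to be a plain Python int label.
    """
    # Imported inside the function so this module loads even without Caffe.
    from caffe.io import array_to_datum, datum_to_array

    datum = array_to_datum(X, y)     # array + label -> Datum
    X_back = datum_to_array(datum)   # Datum -> array (the label is not returned)
    y_back = datum.label             # the label is read from the Datum itself
    return X_back, y_back
```

With X = np.zeros((3, 64, 64), dtype=np.uint8), calling datum_roundtrip(X, 1) should hand back an array of the same shape together with the label 1.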
Function to write to lmdb:
This function takes a folder path img_dir as input and pushes all the images in the folder into an lmdb database specified by db_name.
map_size is the capacity of the database in bytes. In my case I have a total of len(files) image patches of size 64*64*3, and I used a factor of 2 just in case.
I’ve also set y=1 for all images because I don’t have labels yet. You will need to adapt this to your own situation.
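Putting the description above together, a write function could look roughly like the sketch below. It is a reconstruction under assumptions, not the original listing: the names estimate_map_size and write_images_to_lmdb, the use of Pillow to load images, the extension filter, the zero-padded keys and the 1 MiB floor on map_size are all my own choices for illustration. It also assumes every image is a 64x64 RGB patch stored as uint8.

```python
import os


def estimate_map_size(n_images, shape=(64, 64, 3), factor=2):
    """Bytes needed for n_images uint8 images of `shape`, times a safety factor."""
    bytes_per_image = 1
    for dim in shape:
        bytes_per_image *= dim
    return n_images * bytes_per_image * factor


def write_images_to_lmdb(img_dir, db_name, label=1):
    """Push every image in img_dir into the lmdb database db_name."""
    # Third-party imports stay local so estimate_map_size works without them.
    import lmdb
    import numpy as np
    from PIL import Image                # assumption: Pillow decodes the images
    from caffe.io import array_to_datum

    exts = (".png", ".jpg", ".jpeg")     # assumption: which files count as images
    files = sorted(f for f in os.listdir(img_dir) if f.lower().endswith(exts))

    # The 1 MiB floor is my own addition: lmdb has fixed page overhead that the
    # plain "data size times 2" estimate can miss for very small folders.
    map_size = max(estimate_map_size(len(files)), 1 << 20)

    env = lmdb.open(db_name, map_size=map_size)
    with env.begin(write=True) as txn:   # commits when the block exits cleanly
        for i, fname in enumerate(files):
            img = np.asarray(Image.open(os.path.join(img_dir, fname)).convert("RGB"))
            X = img.transpose(2, 0, 1)           # HxWxC -> CxHxW, as Datum expects
            datum = array_to_datum(X, label)     # placeholder label for every image
            key = "{:08d}".format(i).encode("ascii")
            txn.put(key, datum.SerializeToString())
    env.close()
    return len(files)
```

Zero-padded keys keep the entries in insertion order, since lmdb sorts keys as byte strings. Note that Pillow loads channels as RGB; if your network expects BGR, reverse the channel axis before building the Datum.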
Function to read from lmdb:
Here we iterate over all entries in the database and optionally visualize a few images. It is a good way to check that the images were processed correctly.
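A read function along these lines might look like the following sketch. Again, this is a reconstruction under assumptions rather than the original listing: read_lmdb and n_preview are hypothetical names, and matplotlib is assumed for the visualization.

```python
def read_lmdb(db_name, n_preview=0):
    """Return (arrays, labels) for every entry; optionally show the first n_preview."""
    # Third-party imports stay local so this module loads without them.
    import lmdb
    from caffe.proto import caffe_pb2
    from caffe.io import datum_to_array

    arrays, labels = [], []
    env = lmdb.open(db_name, readonly=True)
    with env.begin() as txn:                  # read-only transaction
        for key, value in txn.cursor():       # iterates over (key, value) pairs
            datum = caffe_pb2.Datum()
            datum.ParseFromString(value)      # bytes -> Datum
            arrays.append(datum_to_array(datum))
            labels.append(datum.label)
    env.close()

    if n_preview > 0:
        import matplotlib.pyplot as plt       # assumption: matplotlib for display
        for X, y in zip(arrays[:n_preview], labels[:n_preview]):
            plt.figure()
            plt.imshow(X.transpose(1, 2, 0))  # CxHxW -> HxWxC for imshow
            plt.title("label = {}".format(y))
        plt.show()
    return arrays, labels
```

Calling read_lmdb(db_name, n_preview=3) would load everything and pop up the first three images, which is usually enough to spot a wrong channel order or a transposed axis.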