How to Hook Up lmdb with Caffe using Python!
Long story short, here’s how I figured out how to interact with lmdb using Python.
First, a bit of setup:
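A minimal sketch of what the setup might look like. The package list is my assumption about what this walkthrough needs, and the availability check is only there so the snippet fails gently on a machine where pycaffe or lmdb is not installed yet.

```python
import importlib.util

# Top-level packages assumed by this walkthrough.
# caffe.proto and caffe.io both ship inside the pycaffe package.
packages = ["numpy", "lmdb", "caffe"]

# Collect the ones Python cannot find on this machine.
missing = [name for name in packages if importlib.util.find_spec(name) is None]

if missing:
    print("Install these first:", ", ".join(missing))
else:
    import numpy as np
    import lmdb
    from caffe.proto import caffe_pb2
    from caffe.io import array_to_datum, datum_to_array
```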
The setup relies on two packages that may be unfamiliar.
The caffe.proto package defines a lot of things, but here we only use the data structure Datum. You can think of it as an intermediate form between our images and labels and the lmdb entries.
The caffe.io package has two helper functions, datum = array_to_datum(X, y) and X = datum_to_array(datum), which save us some time defining the structure of our Datum object. Note that y is not returned by datum_to_array; you can simply read datum.label to get it.
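To make the round trip concrete, here is a small sketch. The function name datum_roundtrip is hypothetical, and I am assuming X is a 3-D array laid out as channels x height x width and y is a plain Python integer.

```python
def datum_roundtrip(X, y):
    """Pack array X and integer label y into a Datum, then unpack them again."""
    # Imported here so this file still loads on a machine without pycaffe.
    from caffe.io import array_to_datum, datum_to_array

    datum = array_to_datum(X, y)    # Datum carrying both the pixels and the label
    X_back = datum_to_array(datum)  # pixels only, reshaped to channels x height x width
    y_back = datum.label            # the label is read off the Datum itself
    return X_back, y_back
```

Calling it with something like np.zeros((3, 64, 64), dtype=np.uint8) and a label of 1 should hand back an equal array and the same label.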
Function to write to lmdb:
This function takes a folder path img_dir as input and pushes all the images in the folder into an lmdb database.
map_size is the capacity of the database in bytes. In my case I have a total of len(files) image patches of size 64*64*3, and I used a factor of 2 just in case.
I’ve also set y=1 for all images because I don’t have labels yet. You will need to make changes according to your situation.
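Here is a rough sketch of how such a write function could look, not necessarily the exact code I ran. The names estimate_map_size, write_images_to_lmdb, lmdb_path and label are hypothetical, the zero-padded integer keys are my own choice, and I am assuming the folder is non-empty and holds only 64x64 color images that caffe.io.load_image can read.

```python
import os


def estimate_map_size(num_images, height=64, width=64, channels=3, factor=2):
    """Bytes to reserve: raw uint8 pixels for every image, times a safety factor."""
    return num_images * height * width * channels * factor


def write_images_to_lmdb(img_dir, lmdb_path, label=1):
    """Store every image in img_dir as a serialized Datum; returns the image count."""
    # Third-party imports stay inside the function so the helper above
    # can be used on a machine without pycaffe or lmdb.
    import numpy as np
    import lmdb
    import caffe.io

    files = sorted(os.listdir(img_dir))
    env = lmdb.open(lmdb_path, map_size=estimate_map_size(len(files)))
    with env.begin(write=True) as txn:
        for idx, name in enumerate(files):
            # load_image returns height x width x 3 floats in [0, 1]
            img = caffe.io.load_image(os.path.join(img_dir, name))
            # Convert to uint8 and reorder to channels x height x width
            X = (img * 255).astype(np.uint8).transpose(2, 0, 1)
            datum = caffe.io.array_to_datum(X, label)
            # Zero-padded keys keep the entries in insertion order
            key = "{:08d}".format(idx).encode("ascii")
            txn.put(key, datum.SerializeToString())
    env.close()
    return len(files)
```

For 100 patches the helper reserves 100 * 64 * 64 * 3 * 2 = 2,457,600 bytes, which leaves room for the small per-entry overhead of the serialized Datum.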
Function to read from lmdb:
Here we iterate over all entries in the database, and visualize a few images if necessary. It is a good way to check if the images are processed correctly.
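A sketch of the read side under the same assumptions. The names read_images_from_lmdb, lmdb_path and num_to_show are hypothetical, and matplotlib is my pick for the visualization.

```python
def read_images_from_lmdb(lmdb_path, num_to_show=0):
    """Walk every entry in the database, optionally displaying the first few."""
    # Imported here so this file still loads on a machine without pycaffe or lmdb.
    import lmdb
    from caffe.proto import caffe_pb2
    from caffe.io import datum_to_array

    count = 0
    env = lmdb.open(lmdb_path, readonly=True)
    with env.begin() as txn:
        for key, value in txn.cursor():
            datum = caffe_pb2.Datum()
            datum.ParseFromString(value)  # bytes from lmdb back into a Datum
            X = datum_to_array(datum)     # channels x height x width
            y = datum.label
            if count < num_to_show:
                import matplotlib.pyplot as plt
                # imshow expects height x width x channels
                plt.imshow(X.transpose(1, 2, 0))
                plt.title("key={} label={}".format(key.decode("ascii"), y))
                plt.show()
            count += 1
    env.close()
    return count
```

The returned count is handy for a quick sanity check against the number of files that were written.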
PhD student at USC working on computer vision.