Inserting large volumes of data into SQLite - speed issues

by mjc147 » Thu, 03 Sep 2009 00:24:24 GMT


 I need to insert approximately one million rows of data (spread over 4
tables). Each row has one or two numeric fields and two or three text
fields (normally quite short), with a single-column index on every
field.
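
For illustration, one of the four tables might look something like
this (the table and column names here are made up, not my real
schema):

db.execSQL("CREATE TABLE words (" +
    "frequency INTEGER, " +
    "word TEXT, " +
    "translation TEXT)");
// ...plus a single-column index on each field, e.g.:
db.execSQL("CREATE INDEX idx_word ON words(word)");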

After doing some tests on the emulator with a small test set, I
extrapolated my results (assuming the last row inserts at roughly the
same speed as the first) and estimated that the full import would take
about 15 hours (I have a Core 2 Duo running Vista).

However, I then tried running the small test set on my Hero and was
surprised to see it run more than four times faster - I'd guess the
full import would take about 3 hours on the Hero.

Alternatively, I could make the pre-populated database available as a
download on the first run, but this is likely to be a 120MB download.

I have a few questions I'm hoping someone can help me with:

1. Is it normal/expected for SQLite on an actual device to be 4-5
times faster than on the emulator?
2. Do the figures above generally sound fast/slow/normal (sorry this
is so vague)?
3. From a user's perspective, would it be better to provide a 120MB
download, or a 6MB download plus a 3-hour "first-time initialisation"
step?



Inserting large volumes of data into SQLite - speed issues

by mjc147 » Thu, 03 Sep 2009 01:25:06 GMT


 It's a bilingual dictionary. The source data is a 2.5MB flat file once
compressed. The database stores this data plus some tables to assist
lookups; most of the size comes from the db indexes.

I've just done some more tests and I see that after removing the
indexes the db size is more like 40MB (the indexes can be created
pretty quickly on the device).
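
In other words, run the import against index-free tables and then do
something like this on the device afterwards (the index and table
names are made up):

db.execSQL("CREATE INDEX idx_word ON words(word)");
db.execSQL("CREATE INDEX idx_translation ON words(translation)");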

And I originally forgot about compression, which reduces the size to
almost a third of the original. So the options are now:

1. 15MB database download, then build the indexes on the device
2. 2.5MB flat file download, then perform the lengthy inserts - maybe
I can get this down well below the 3-hour estimate, but I don't
know...

Would a 15MB download be unreasonable?








Inserting large volumes of data into SQLite - speed issues

by mjc147 » Thu, 03 Sep 2009 01:49:09 GMT


 


Actually, I don't. I looked at it, but it seemed to be related to
ContentProvider and ContentResolver, neither of which I'm using. Am I
missing something here?

I use execSQL("INSERT INTO my_table VALUES(?,?,?)", new Object[]
{1,2,3}) and perform on average 10 calls per transaction.
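
I guess I could also batch far more rows per transaction - something
like this, maybe (the 1000-row batch size is arbitrary, and rows is a
stand-in for my parsed data, a List<Object[]>):

db.beginTransaction();
try {
    for (int i = 0; i < rows.size(); i++) {
        db.execSQL("INSERT INTO my_table VALUES(?,?,?)", rows.get(i));
        // commit in large batches instead of every 10 rows
        if ((i + 1) % 1000 == 0) {
            db.setTransactionSuccessful();
            db.endTransaction();
            db.beginTransaction();
        }
    }
    db.setTransactionSuccessful();
} finally {
    db.endTransaction();
}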



Inserting large volumes of data into SQLite - speed issues

by lucky4me » Thu, 03 Sep 2009 02:19:30 GMT


 In addition to skink's post - try to use prepared/compiled statements:

import android.database.DatabaseUtils;
import android.database.sqlite.SQLiteStatement;

Object[] params = new Object[3];

// compile the INSERT once, then bind and execute it for every row
SQLiteStatement statement = db.compileStatement(
    "INSERT INTO my_table VALUES(?,?,?)");
// loop here
for (...) {
    params[0] = ...;
    params[1] = ...;
    params[2] = ...;
    DatabaseUtils.bindObjectToProgram(statement, 1, params[0]);
    DatabaseUtils.bindObjectToProgram(statement, 2, params[1]);
    DatabaseUtils.bindObjectToProgram(statement, 3, params[2]);
    statement.execute();
}
statement.close();
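
Wrapping the whole loop in one transaction (beginTransaction /
setTransactionSuccessful / endTransaction) should speed this up
further still.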








Inserting large volumes of data into SQLite - speed issues

by skink » Thu, 03 Sep 2009 03:01:40 GMT


 


I agree with Dianne - if you have string key/value data and the number
of rows is quite large (hundreds of thousands), a typical sqlite db
would be huge; with simple gzip compression you could make your data
about 1/3 smaller
pskink



Inserting large volumes of data into SQLite - speed issues

by skink » Thu, 03 Sep 2009 04:05:51 GMT


 


did you try allocating the params array once?

Object[] params = new Object[3];

// loop here
for (...) {
    params[0] = ...;
    params[1] = ...;
    params[2] = ...;
    db.execSQL("INSERT INTO my_table VALUES(?,?,?)", params);
}



Inserting large volumes of data into SQLite - speed issues

by mjc147 » Thu, 03 Sep 2009 08:45:59 GMT


 @Dianne - my assumption was that since all of the fields in the flat
file need to be searchable, a database would be the best approach. I'm
wondering how this could be done without a database. Maybe a file for
each index, with each entry consisting of a key-value pair, and
lookups done with some kind of binary search. Is this what you mean?
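
Something along these lines, perhaps (a very rough sketch - the
fixed-length record format and the field sizes are just assumptions
for illustration):

import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical index file: records sorted by key, each record being a
// KEY_LEN-byte space-padded key followed by a VALUE_LEN-byte value.
static final int KEY_LEN = 24;
static final int VALUE_LEN = 40;
static final int RECORD_LEN = KEY_LEN + VALUE_LEN;

static String lookup(RandomAccessFile index, String key) throws IOException {
    long lo = 0;
    long hi = index.length() / RECORD_LEN - 1;
    byte[] record = new byte[RECORD_LEN];
    while (lo <= hi) {
        long mid = (lo + hi) / 2;
        index.seek(mid * RECORD_LEN);   // jump straight to the record
        index.readFully(record);
        String recordKey = new String(record, 0, KEY_LEN, "UTF-8").trim();
        int cmp = recordKey.compareTo(key);
        if (cmp == 0) {
            return new String(record, KEY_LEN, VALUE_LEN, "UTF-8").trim();
        } else if (cmp < 0) {
            lo = mid + 1;               // key is in the upper half
        } else {
            hi = mid - 1;               // key is in the lower half
        }
    }
    return null; // not found
}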

@skink, @lucky4me - thanks for the optimization tips; I'll try those
out. I was kind of assuming that the lion's share of the CPU time was
being spent in the DB, so I haven't even started on the Java side of
the optimization. I'll post some new results soon.

@skink - the gzip compression would only be useful when downloading
the data, wouldn't it? After installation/first-time initialisation,
wouldn't all the data be best stored uncompressed?








Inserting large volumes of data into SQLite - speed issues

by mjc147 » Thu, 03 Sep 2009 12:00:45 GMT


 I've done some speed tests and here are my findings (disclaimer: my
test runs for just over a minute, so the timings are not extremely
reliable):

1. Reusing the SQL params object array - negligible speed improvement;
test too short to say anything with confidence
2. Using prepared statements rather than execSQL() - 20% faster (BTW -
there is no need to use an array at all here; see the sketch after
this list)
3. Removing log statements - 100% faster (whoops!)
4. Inserting 10 times as many rows - the average per-row speed stays
about the same
5. Running on the Hero instead of the emulator - 400%-500% faster
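
For point 2, binding directly looks something like this (Entry and its
fields are just stand-ins for my actual row class):

SQLiteStatement statement = db.compileStatement(
    "INSERT INTO my_table VALUES(?,?,?)");
for (Entry e : entries) {
    statement.bindLong(1, e.id);          // bind each column directly...
    statement.bindString(2, e.word);      // ...no Object[] needed
    statement.bindString(3, e.translation);
    statement.execute();
}
statement.close();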

There are quite a few other objects being created within the main
loop, so maybe there are some further speed improvements to be made.
However, after experimenting with point 1, I don't expect these to be
as dramatic as points 2, 3 and 5.

So, assuming I stick with the database approach, the options are:

1. 15MB database download
2. 2.5MB download then perform a one-hour initialisation step

I can see that the second option is a bit weird...



Inserting large volumes of data into SQLite - speed issues

by skink » Thu, 03 Sep 2009 14:26:16 GMT


 


well, if both key-value fields need to be searchable, as you said, you
are right: compression here is not a good option...




Inserting large volumes of data into SQLite - speed issues

by mjc147 » Fri, 04 Sep 2009 12:44:52 GMT


 I performed the complete import, which took just over one hour on the
Hero. The final db size (no indexes) was 24MB, and it zips to 11MB.

I had to build the db in the default location (i.e. not on the sdcard)
because it was much, much faster. Once the import was finished, I
simply copied the db over to the sdcard.

So this db zip file would be downloaded when a user first runs the
app. The db file would be automatically uncompressed and the indexes
built.
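
Roughly like this, I imagine (a sketch only - the URL, path and index
name are placeholders, and it assumes gzip rather than zip):

import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;
import java.util.zip.GZIPInputStream;

// First run: download the compressed db and decompress it into place.
static void installDatabase(String dbUrl, String dbPath) throws IOException {
    InputStream in = new GZIPInputStream(new URL(dbUrl).openStream());
    OutputStream out = new FileOutputStream(dbPath);
    byte[] buf = new byte[8192];
    int n;
    while ((n = in.read(buf)) > 0) {
        out.write(buf, 0, n);
    }
    out.close();
    in.close();
    // ...then open the db and build the indexes, e.g.:
    // db.execSQL("CREATE INDEX idx_word ON words(word)");
}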

How does this sound?

I would still like to try out an alternative to using SQLite, as
Dianne suggested. Can anyone give me any pointers, please?


