Inserting large volumes of data into SQLite - speed issues

by mjc147 » Thu, 03 Sep 2009 00:24:24 GMT


Sponsored Links
 I need to insert approximately one million rows of data (spread over 4
tables) - each row has one or two numeric fields, and two or three
text fields (normally quite short). Single-column indexes on all the
fields.

After doing some tests on the emulator with a small test set, I
extrapolated (assuming the last row will be inserted with a similar
speed to the first row) my results to figure out that this would take
about 15 hours (I have a Core 2 Duo running Vista).

However, then I tried running the small test set on my Hero and I was
surprised to see it run more than four times faster. I would guess it
would take about 3 hours on the Hero.

Alternatively, I could make the pre-populated database available as a
download on the first run but this is likely to be a 120MB download.

I have a few questions I'm hoping someone can help me with:

1. SQLite performance on the actual device being 4-5 times faster
compared to the emulator - is this normal/expected?
2. Do the figures above generally sound fast/slow/normal (sorry this
is so vague)?
3. From a user perspective - would it be best to provide a 120MB
download or a 3 hour "first-time initialisation" step (plus 6MB
download)?
--~--~---------~--~----~------------~-------~--~----~



Inserting large volumes of data into SQLite - speed issues

by mjc147 » Thu, 03 Sep 2009 01:25:06 GMT


 Its a bilingual dictionary. The source data is a 2.5MB flat file once
compressed. The database stores this data plus some tables to assist
lookups. Most of the size comes from the db indexes.

I've just done some more tests and I see that after removing the
indexes the db size is more like 40MB (the indexes can be created
pretty quickly on the device).

And I originally forgot about compression which reduces the size to
almost a third the original. So the options are now:

1. 15MB database download then build indexes on the device
2. 2.5MB flat file download then perform the lengthy insert statements
- maybe I can get this down well below the 3 hours estimate but I
don't know...

Would a 15MB download be too unreasonable?




--~--~---------~--~----~------------~-------~--~----~


Sponsored Links


Inserting large volumes of data into SQLite - speed issues

by mjc147 » Thu, 03 Sep 2009 01:49:09 GMT


 


Actually, I don't. I looked at it but it seemed to be related to
ContentProvider and ContentResolver neither of which I'm using. Am I
missing something here?

I use execSQL("INSERT INTO my_table VALUES(?,?,?)", new Object[]
{1,2,3}) and perform on average 10 calls per transaction.
--~--~---------~--~----~------------~-------~--~----~



Inserting large volumes of data into SQLite - speed issues

by lucky4me » Thu, 03 Sep 2009 02:19:30 GMT


 In addition to skink post - try to use prepared/compiled statements:

Object[] params = new Object[3];

SQLiteStatement statement = compileStatement("INSERT INTO my_table
VALUES(?,?,?)");
// loop here
for(...) {
  params[0] = ...
  params[1] = ...
  params[2] = ...
  DatabaseUtils.bindObjectToProgram(statement, 1, params[0]);
  DatabaseUtils.bindObjectToProgram(statement, 2, params[1]);
  DatabaseUtils.bindObjectToProgram(statement, 3, params[2]);
  statement.execute();

}





--~--~---------~--~----~------------~-------~--~----~



Inserting large volumes of data into SQLite - speed issues

by skink » Thu, 03 Sep 2009 03:01:40 GMT


 


i agree with Dianne, if you have string key/value data and #rows is
quite large (hundrets of thousands) typical sqlite db would be huge -
with simple gzip compression you could have your data 1/3 smaller

pskink
--~--~---------~--~----~------------~-------~--~----~



Inserting large volumes of data into SQLite - speed issues

by skink » Thu, 03 Sep 2009 04:05:51 GMT


 


did you try allocating params array once?

Object[] params = new Object[3];

// loop here
for(...) {
params[0] = ...
params[1] = ...
params[2] = ...
execSQL("INSERT INTO my_table VALUES(?,?,?)", params)
}
--~--~---------~--~----~------------~-------~--~----~



Inserting large volumes of data into SQLite - speed issues

by mjc147 » Thu, 03 Sep 2009 08:45:59 GMT


 @Dianne - my assumption was that since all of the fields in the flat
file need to be searchable, then a database would be the best
approach. I'm wondering how this can be done without a database. Maybe
a file for each index - each entry consisting of key-value pairs. For
performing a lookup on the index, do some kind of binary search. Is
this what you mean?

@skink, @lucky4me - thanks for the optimization tips. I'll try those
out. I was kind of assuming that the lion's share of the CPU usage was
being done in the DB so I haven't even started on the java-side of the
optimization. I'll post some new results soon.

@skink - the gzip compression would only be useful for when
downloading the data, wouldn't it? After installation/first-time-
initialisation, wouldn't all the data be best stored uncompressed?





--~--~---------~--~----~------------~-------~--~----~



Inserting large volumes of data into SQLite - speed issues

by mjc147 » Thu, 03 Sep 2009 12:00:45 GMT


 I've done some speed tests and here are my findings (disclaimer, my
test runs for just over a minute so timings are not extremely
reliable):

1. Reusing SQL params object array - negligible speed improvement -
test too short to say anything with confidence
2. Using prepared statements rather then execSQL() - 20% faster  (BTW
- there is no need to use an array at all here)
3. Removing log statements - 100% faster (whoops!)
4. Inserting 10 times as many rows (speed (per row average) stays
about the same)
5. Running on the Hero instead of the emulator - 400%-500% faster

There are quite a few other objects being created within the main loop
so maybe there are some speed improvements to be made. However, after
experimenting with point 1, I don't expect these to be as dramatic as
points 2,3 and 5.

So, assuming I stick with the database approach, the options are:

1. 15MB database download
2. 2.5MB download then perform a one-hour initialisation step

I can see that second option is a bit weird...
--~--~---------~--~----~------------~-------~--~----~



Inserting large volumes of data into SQLite - speed issues

by skink » Thu, 03 Sep 2009 14:26:16 GMT


 


well, if both key-value fields need to be searchable as you said, you
are right: compression here is not a good option....

--~--~---------~--~----~------------~-------~--~----~



Inserting large volumes of data into SQLite - speed issues

by mjc147 » Fri, 04 Sep 2009 12:44:52 GMT


 I performed the complete import which took just over one hour on the
Hero. Final db size (no indexes) was 24MB and zips to 11MB.

I had to build the db in the default location (i.e. not on the sdcard)
because it was much much faster. Once the import was finished, I
simply copied the db over to the sdcard.

So this db zip file would be downloaded when a user first runs the
app. The db file would be automatically uncompressed and the indexes
built.

How does this sound?

I would still like to try out an alternative to using SQLite, as
Dianne suggested. Can anyone give me any pointers, please?
--~--~---------~--~----~------------~-------~--~----~



Other Threads

1. GridView

Hello all,

I have a question about grid view: is there any chance to stretch
GridView vertically. What I'm trying to do is to create a screen with
a limited number of items (6). I need exactly 2 columns and 3 rows (or
3x2 in landscape mode) to fill all the screen.

In a grid view I can control columns number, and columns stretching
mode, but not rows. I'm looking into overriding
AbsListView.generateLayoutParams() method, but want to ask first.
Probably, there is some simpler way available.

Thank you in advance.
--~--~---------~--~----~------------~-------~--~----~

2. Timing an ASyncTask

Hi all.

I'm running a network service within an ASyncTask. I want to be able
to time the task, and after a certain period of time interrupt it.

Is there a simple way to do this? Basically, when the doInBackground()
methods starts, I want to say "If it hasn't completed in 30 seconds,
do something else".

Thanks.

-- 

3. Strange V4L2 Camera Issue

4. Porting Android 2.1 to a small device with 1.4" 160x160 screen

5. When to add child views to a compound control?

6. How to programatically refresh Gallery View?

7. How to add a unique label on the OverlayItems?