1. Beatrix Willius
  2. Valentina Database ADK
  3. Thu, June 28 2018, 04:32 AM
Hi,

How can I speed up the code below, which I use to check whether a record is already in the database? Instruments tells me that the app spends about 10% of its processing time in this method. For historical reasons (a.k.a. stupidity) the method is a bit of a mess: in some cases I forgot to add the brackets around the message id. The vField is cached.


Public Sub Constructor(theMessageID as String, InternalMessageIDField as vField)

  if theMessageID = "" or InternalMessageIDField = nil then
    theResult = true
    Return
  end if

  dim theCheckset as VArraySet
  if left(theMessageID, 1) <> "<" and Right(theMessageID, 1) <> ">" then
    theCheckset = InternalMessageIDField.FindValueAsArraySet(Left("<" + theMessageID + ">", 100))
    if theCheckset = nil then
      theResult = False
    else
      dim i as Integer = theCheckset.Count
      if i = 0 then
        theResult = False
      else
        theResult = true
      end if
    end if

    if theResult = true then Return 'already found, if not try without brackets, AppleScript gives back the message id without brackets for Mail
    theCheckset = nil

  end if

  theCheckset = InternalMessageIDField.FindValueAsArraySet(Left(theMessageID, 100))
  if theCheckset = nil then
    theResult = False
  else
    dim i as Integer = theCheckset.Count
    if i = 0 then
      theResult = False
    else
      theResult = true
    end if
  end if

  if theResult = true then Return 'already found,
  'if not try with right bracket, found some cases where data has been entered with a right bracket only
  theCheckset = nil
  theCheckset = InternalMessageIDField.FindValueAsArraySet(Left(theMessageID + ">", 100))
  if theCheckset = nil then
    theResult = False
  else
    dim i as Integer = theCheckset.Count
    if i = 0 then
      theResult = False
    else
      theResult = true
    end if
  end if

End Sub

Oh, and after updating to Valentina 8.3.3 the code is a couple of seconds slower compared to 7.5.9.

Xojo 2018r1.1. Valentina 8.3.3, Mojave latest beta.
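
A possible tidy-up of the posted method, as a sketch only: the three near-identical lookup blocks can be folded into one helper while keeping the same order of checks. HasValue is a hypothetical helper name that is not part of the original code; FindValueAsArraySet, Count and theResult are used exactly as in the post. This shortens the code but does not reduce the number of lookups for a message id that is not in the database.

```xojo
' Hypothetical helper: True if theField contains theValue.
' Only the first 100 characters are compared, as in the posted code.
Private Function HasValue(theField As VField, theValue As String) As Boolean
  Dim theSet As VArraySet = theField.FindValueAsArraySet(Left(theValue, 100))
  If theSet = Nil Then Return False
  Return theSet.Count > 0
End Function

Public Sub Constructor(theMessageID As String, InternalMessageIDField As VField)
  ' Empty id or missing field: report "found", as the original does.
  theResult = True
  If theMessageID = "" Or InternalMessageIDField = Nil Then Return

  ' Same order as the original: "<id>", then the id as given, then "id>".
  If Left(theMessageID, 1) <> "<" And Right(theMessageID, 1) <> ">" Then
    If HasValue(InternalMessageIDField, "<" + theMessageID + ">") Then Return
  End If
  If HasValue(InternalMessageIDField, theMessageID) Then Return
  If HasValue(InternalMessageIDField, theMessageID + ">") Then Return

  theResult = False
End Sub
```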



Regards

Beatrix Willius
Ruslan Zasukhin Accepted Answer
Hi Beatrix,

You can consider the usage of methods

VField.ValueExists()

VField.FindSingle()
Beatrix Willius Accepted Answer
Thanks, didn't know about these functions. I'll have a look.

Regards

Beatrix Willius
Beatrix Willius Accepted Answer
FindSingleValue isn't any faster than using an ArraySet.

1. Write first time: Get MessageID from Mail, check if MessageID is in database (new database), get full data from Mail, write to database.
2. Write second time: get MessageID from Mail, check if MessageID is in database (existing database), nothing more to do.

Case  ArraySet  FindSingleValue
1     21        25
2     7         8

All values are in seconds. The archival set is 300 mails or so. I tested a couple of runs to verify the data.
Ruslan Zasukhin Accepted Answer
Well, maybe you can send a ZIP archive of the test database together with that query, so we can test it from Valentina Studio.
Beatrix Willius Accepted Answer
For testing I've done a dictionary approach:


if MessageIDs = Nil then MessageIDs = new Dictionary

if MessageIDs.HasKey(theMessageID) then
theResult = True
else
MessageIDs.Value(theMessageID) = theMessageID
end if


The improvement in speed is very nice. Of course, I haven't done any checking for the correct database. And for existing databases I would have to read all message ids.
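
A minimal sketch of how the dictionary check could also cover the bracket variants from the original method, so that "<id>", "id" and "id>" all count as the same message. NormalizeMessageID and IsDuplicate are hypothetical names; MessageIDs is assumed to be a Dictionary property, as in the snippet above. If ids are loaded from an existing database where only the first 100 characters were stored, the same truncation would presumably be needed here as well.

```xojo
' Hypothetical helper: strips one leading "<" and one trailing ">"
' so that all bracket variants map to the same dictionary key.
Function NormalizeMessageID(theMessageID As String) As String
  Dim s As String = theMessageID
  If Left(s, 1) = "<" Then s = Mid(s, 2)
  If Right(s, 1) = ">" Then s = Left(s, Len(s) - 1)
  Return s
End Function

' Returns True if the id was seen before; otherwise remembers it.
Function IsDuplicate(theMessageID As String) As Boolean
  If MessageIDs = Nil Then MessageIDs = New Dictionary

  Dim key As String = NormalizeMessageID(theMessageID)
  If MessageIDs.HasKey(key) Then Return True

  MessageIDs.Value(key) = True
  Return False
End Function
```

With this, one dictionary lookup replaces the three database lookups per message id.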

Would access to the field be faster if I copied the field into its own table?

Regards

Beatrix Willius
Ruslan Zasukhin Accepted Answer
For INFO:

1) Valentina DB is columnar. This means that if you SCAN a single field, the other columns of the table are not touched. So, generally speaking, there is no sense in moving a column into a separate table. Not in Valentina DB.

2) The question is how to SCAN a field in such a way that the other fields are ignored. In V4RB a VCursor can do this. You should run "SELECT fld FROM T", selecting only that one field, and then iterate the cursor in a loop, reading the values. ONLY THAT ONE field will be loaded from disk.

The VField.FindXXX() methods should also do an efficient SCAN of a column.
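
The cursor scan described in point 2 might look roughly like the sketch below, here used to fill a Dictionary of message ids once at startup. Everything in it is an assumption to be checked against the V4RB reference: LoadMessageIDs is a hypothetical method, "T" and "fld" are the placeholder names from the query above, MessageIDs is assumed to be a Dictionary property, and SqlSelect, FirstRecord, NextRecord, Field and GetString are written from memory of the V4RB API, not verified against version 8.3.3.

```xojo
' Assumption: db is an open VDatabase; "T" and "fld" are placeholders.
Sub LoadMessageIDs(db As VDatabase)
  If MessageIDs = Nil Then MessageIDs = New Dictionary

  ' Select only the one column so that no other field is read from disk.
  Dim cur As VCursor = db.SqlSelect("SELECT fld FROM T")
  If cur = Nil Then Return

  ' Walk the cursor once and remember every stored message id.
  If cur.FirstRecord Then
    Do
      MessageIDs.Value(cur.Field("fld").GetString) = True
    Loop Until Not cur.NextRecord
  End If

  cur = Nil ' release the cursor
End Sub
```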
Beatrix Willius Accepted Answer
1) Thanks for the information.

2) Had the idea with the cursor myself so that the data is read only once. VField.Findxxx is slow compared to the dictionary method.

FindArraySet: 11 minutes 36 seconds.
FindSingleValue: 12 minutes 28 seconds.
Dictionary in Xojo: 7 minutes 48 seconds.

The test started with an empty database; after adding the data the database is 180 MB. Just by changing the method of duplicate checking I got a huge speed increase!
Ivan Smahin Accepted Answer
What about making the InternalMessageID field indexed?
Beatrix Willius Accepted Answer
@Ivan Smahin: indexing the message id field makes everything even slower.