Removing items in an arraylist from another arraylist in VB.NET
I am writing a VB.NET webforms site, one page of which has to load a list
of files into a listbox. It needs to load all PDF and TIF files in a
directory that do not have entries in a database. I am doing this
successfully at the moment with the following code. Basically, I query the
database to get an arraylist of filename entries, then go through each
file in the directory, check its name against each entry in the arraylist,
and if its name is not in the arraylist, add it to a list that is bound
to the listbox:
Dim category As String = "RFQ"
'Initialize database connection variables
Dim sql As String
Dim query As System.Data.SqlClient.SqlCommand
Dim result As System.Data.SqlClient.SqlDataReader
'Load document list from database
Dim savedfiles As New ArrayList
database.Open() 'Open connection to database
'SQL query to read file header information
sql = "SELECT filename FROM fileheaders WHERE [category] = '" & category & "'"
'Create query to send to database
query = New System.Data.SqlClient.SqlCommand(sql, database)
result = query.ExecuteReader() 'Execute query
While result.Read()
    savedfiles.Add(row(result, "filename"))
End While
result.Close()
database.Close()
'The following code section pulls all files from the current file directory.
Dim filelist As New ArrayList
Dim dir As New System.IO.DirectoryInfo(dirName) 'Get directory information
Dim files As System.IO.FileInfo() = dir.GetFiles() 'Get all files in directory
For Each file As System.IO.FileInfo In files
    If (file.Extension Like ".pdf" OrElse file.Extension Like ".tif") AndAlso
       Not inArray(savedfiles, file.Name) Then
        filelist.Add(file.Name) 'Add .pdf and .tif files to list of documents
    End If
Next
filelist.TrimToSize()
eltFilelist.DataSource = filelist
eltFilelist.DataBind() 'Bind document list to listbox
Then the inArray function code:
Function inArray(arr As ArrayList, str As String) As Boolean
    For Each item In arr
        'Only string entries can match a filename
        If TypeOf item Is String AndAlso str = CStr(item) Then
            Return True
        End If
    Next
    Return False
End Function
Here's the problem: while it works, it seems terribly inefficient. There
are around 27,000 files in the directory and around 26,000 file entries in
the database, so I am checking each of 27,000 filenames against a list of
26,000 names. Without working out the exact count, that's on the order of
hundreds of millions of string comparisons. Is there a more efficient way
to go about this?
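One direction that might help is to replace the linear inArray scan with a hash-based lookup. Below is a minimal sketch, not a tested fix for this page. It assumes the saved filenames can be loaded into a HashSet(Of String), so each membership check is roughly constant time instead of a scan over 26,000 entries. FindUnsaved, the module name, and the sample filenames are hypothetical. The case-insensitive comparer is also an assumption: it suits Windows file names, but it differs from the case-sensitive comparison in inArray.

```vbnet
Imports System
Imports System.Collections.Generic

Module FileFilterSketch
    ' Hypothetical helper: returns the candidate names that are not in savedNames.
    Function FindUnsaved(savedNames As IEnumerable(Of String),
                         candidateNames As IEnumerable(Of String)) As List(Of String)
        ' Build the lookup set once. The case-insensitive comparer is an assumption.
        Dim saved As New HashSet(Of String)(savedNames, StringComparer.OrdinalIgnoreCase)
        Dim unsaved As New List(Of String)
        For Each fileName As String In candidateNames
            ' Contains is a hash lookup, not a scan over every saved entry.
            If Not saved.Contains(fileName) Then
                unsaved.Add(fileName)
            End If
        Next
        Return unsaved
    End Function

    Sub Main()
        ' Made-up sample data, for illustration only.
        Dim saved = New String() {"a.pdf", "b.tif"}
        Dim onDisk = New String() {"a.pdf", "b.tif", "c.pdf"}
        For Each fileName In FindUnsaved(saved, onDisk)
            Console.WriteLine(fileName) ' prints c.pdf
        Next
    End Sub
End Module
```

In the page code, this would mean adding each database filename to the set inside the While result.Read() loop, then testing saved.Contains(file.Name) in place of inArray(savedfiles, file.Name). With about 27,000 files, that is about 27,000 lookups plus the one-time cost of building the set.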