> I am designing an app to do log search efficiently. I have gigabytes
> of server logs that contain all kinds of information - typically I
> [...]
> index and provide the level of detail I would need. So, does my design
> of using a database in the above manner sound good?
It sounds like you're ignoring the fact that the database itself is
probably good at doing full text indexing. Why not have one row per log
entry, and let the database handle the complicated stuff?
Alternatively, if you're really just building an index, consider using
a product built specifically for indexing, such as Lucene.
> 2. On top of this platform, I plan to build layers that do intelligent
> search - say using business logic, it queries and finds out all users
> who got errors and displays them in a UI.
>
> I am curious to know whether there is a better approach to this.
That part sounds perfectly reasonable - although again, it may well be
easier if you have one row per log entry, with separate columns for
the nature of the entry (error, info, etc.), the affected user, and so on.
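
A minimal sketch of the one-row-per-entry layout, using Python's built-in sqlite3 module only because it runs without a server. The table name, column names, and sample rows are all hypothetical; nothing in the thread specifies the real log format.

```python
import sqlite3

# Hypothetical schema: one row per log entry, with the attributes that
# searches care about pulled out into their own columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE log_entries (
        id        INTEGER PRIMARY KEY,
        logged_at TEXT NOT NULL,   -- ISO 8601 timestamp
        level     TEXT NOT NULL,   -- 'error', 'info', ...
        user_name TEXT,            -- NULL when no user is involved
        message   TEXT NOT NULL    -- the raw log line
    )
    """
)
# An ordinary index lets the database look up entries by level and user
# without reading every row.
conn.execute("CREATE INDEX idx_level_user ON log_entries (level, user_name)")

# Made-up sample data for illustration.
sample_rows = [
    ("2008-03-15T23:00:00", "error", "bruce", "bruce: message delivery failed"),
    ("2008-03-15T23:01:00", "info", "alice", "alice logged in"),
    ("2008-03-15T23:02:00", "error", "alice", "alice: disk quota exceeded"),
    ("2008-03-15T23:03:00", "error", None, "scheduler crashed"),
]
conn.executemany(
    "INSERT INTO log_entries (logged_at, level, user_name, message) "
    "VALUES (?, ?, ?, ?)",
    sample_rows,
)


def users_with_errors(conn):
    """Return the distinct users that have at least one error entry."""
    cur = conn.execute(
        "SELECT DISTINCT user_name FROM log_entries "
        "WHERE level = 'error' AND user_name IS NOT NULL "
        "ORDER BY user_name"
    )
    return [row[0] for row in cur]


print(users_with_errors(conn))
```

With columns like these, the "find all users who got errors" layer becomes a plain query rather than a text search.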

Signature
Jon Skeet - <skeet@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
World class .NET training in the UK: http://iterativetraining.co.uk
Bruce - 15 Mar 2008 23:41 GMT
Thanks for the reply Jon.
> Why not have one row per log entry, and let the database handle the complicated stuff?
Won't the search get less efficient if I do this?
If the row reads "Bruce sent a message" and I search for Bruce, my
query would be something like
"SELECT xx FROM xxx WHERE logcolumn LIKE '% Bruce %'".
Isn't using LIKE inefficient, since the database would have to do a
full table scan?
(Whereas if I stored just the word, the index could take me directly
to the row. Multiple words would require joins, though.)
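
The trade-off described here can be sketched with Python's built-in sqlite3 module. The tables `entries` and `words` and the helpers below are hypothetical, and the sketch uses SQLite rather than whatever database the poster has in mind. It compares the query plan for a leading-wildcard LIKE against a lookup in a separate word table, and uses INTERSECT for the multi-word case in place of explicit joins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE entries (id INTEGER PRIMARY KEY, line TEXT NOT NULL);
    -- Hypothetical word table: one row per (word, entry) pair.
    CREATE TABLE words (word TEXT NOT NULL, entry_id INTEGER NOT NULL);
    CREATE INDEX idx_words ON words (word, entry_id);
    """
)


def add_entry(conn, line):
    """Store a log line and one lower-cased row per distinct word in it."""
    entry_id = conn.execute(
        "INSERT INTO entries (line) VALUES (?)", (line,)
    ).lastrowid
    conn.executemany(
        "INSERT INTO words (word, entry_id) VALUES (?, ?)",
        [(word.lower(), entry_id) for word in set(line.split())],
    )
    return entry_id


def plan(conn, sql, params=()):
    """Return the 'detail' text of SQLite's query plan for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


def entries_with_all(conn, words):
    """Ids of entries containing every given word (one lookup per word)."""
    sql = " INTERSECT ".join(
        "SELECT entry_id FROM words WHERE word = ?" for _ in words
    )
    return sorted(row[0] for row in conn.execute(sql, [w.lower() for w in words]))


for line in ["Bruce sent a message", "Alice logged in", "Error for user Bruce"]:
    add_entry(conn, line)

like_plan = plan(conn, "SELECT id FROM entries WHERE line LIKE ?", ("% Bruce %",))
word_plan = plan(conn, "SELECT entry_id FROM words WHERE word = ?", ("bruce",))
print(like_plan)   # a SCAN: every row is examined
print(word_plan)   # a SEARCH using idx_words
print(entries_with_all(conn, ["bruce", "message"]))
```

Note that the pattern '% Bruce %' requires a space on both sides of the word, so it would not match a line that begins or ends with "Bruce", such as the example row itself.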
Also, I would not know from a log entry what/where the user name is, so
I could not fill a separate column for it.
Thanks
Bruce
Jon Skeet [C# MVP] - 16 Mar 2008 00:06 GMT
<snip>
> > Why not have one row per log entry, and let the database handle the complicated stuff?
> Won't the search get less efficient if I do this?
> [...]
> (whereas if I just had the word, the index can directly get me the
> row. Multiple words would require joins though)
Look into full text indexing. A quick look at the docs for SQL Server
2005 suggests that you'd use a "CONTAINS" clause - but I haven't
actually *used* full text indexing myself, so it's new to me too :) I
just know that various databases support it, and it saves you from
reinventing the wheel. (I believe it also supports cunning stuff like
word stemming, and possibly even coping with typos etc.)
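
As a rough illustration of what a full-text index offers, here is a sketch that uses SQLite's FTS5 extension through Python's sqlite3 module, not SQL Server. FTS5 queries use MATCH where SQL Server uses CONTAINS, and the sketch assumes the local SQLite build was compiled with FTS5. The table name `log_fts` and the sample lines are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One row per log entry; the 'porter' tokenizer adds English word stemming.
# Assumption: this SQLite build includes the FTS5 extension.
conn.execute("CREATE VIRTUAL TABLE log_fts USING fts5(message, tokenize='porter')")
conn.executemany(
    "INSERT INTO log_fts (message) VALUES (?)",
    [
        ("Bruce sent a message",),
        ("Alice logged in",),
        ("Disk quota exceeded for bruce",),
    ],
)


def search(conn, query):
    """Return (rowid, message) pairs whose text matches the FTS5 query."""
    cur = conn.execute(
        "SELECT rowid, message FROM log_fts WHERE log_fts MATCH ? ORDER BY rowid",
        (query,),
    )
    return cur.fetchall()


# Single word: answered from the full-text index, case-insensitively.
print(search(conn, "bruce"))
# Several words: FTS5 requires all of them by default, and stemming lets
# "messages" match "message".
print(search(conn, "bruce messages"))
```

The multi-word query needs no hand-written joins; the index handles combining the terms.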
> Also, I would not know from a log entry what/where the user name is.
Is there some way of scanning for known entry patterns when you're
putting the entries into the database, so that you extract really
useful information like that once, rather than searching for it
multiple times?
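
One way to read this suggestion, sketched in Python: match each line against a small list of known patterns at load time and store whatever fields they yield. The two patterns and the `kind` labels below are invented for illustration; the real ones would come from the actual log format, which the thread does not show.

```python
import re

# Hypothetical entry patterns. Each named group becomes a candidate column.
PATTERNS = [
    ("message_sent", re.compile(r"^(?P<user>\w+) sent a message$")),
    ("login_failed", re.compile(r"^Login failed for user (?P<user>\w+)$")),
]


def extract_fields(line):
    """Return the entry kind and user name if the line matches a known pattern.

    Lines that match nothing still get a result, with both fields set to
    None, so they can be stored and left to full-text search.
    """
    for kind, pattern in PATTERNS:
        match = pattern.match(line.strip())
        if match:
            return {"kind": kind, "user": match.group("user")}
    return {"kind": None, "user": None}


print(extract_fields("Bruce sent a message"))
print(extract_fields("Login failed for user alice"))
print(extract_fields("Cache warmed in 120 ms"))
```

The extraction runs once per entry on insert, so later searches by user can use an ordinary indexed column instead of re-parsing the text.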
Bruce - 16 Mar 2008 00:49 GMT
Full text indexing sounds pretty interesting. I took a look, and it
seems that the index is not populated automatically, so I would need
to populate it manually. But I will investigate it further.
Regarding fetching multiple entries at once, I think it depends on
what kind of information users are searching for. I will keep the idea
in mind. Thanks for the input Jon.
Bruce