.NET Forum / Languages / C# / March 2008

Designing a log search application

Bruce - 15 Mar 2008 21:04 GMT
Hello
I am designing an app to do log search efficiently. I have gigabytes
of server logs that contain all kinds of information - typically I
query for a user name in a certain time span to find out what the
user actually did during that time, what errors he got, etc.
I previously just used findstr across these files, but I am
finding it slow and inaccurate.
So, I am planning to write a platform that parses all the logs in
real time and stores the words in a database.
For example, a line in the log reading "User connected to server"
would result in 4 rows in the database, one for each of the words, with
information about the file, time, and relative location in the log, among
other things.
This way, if I query for "bruce connected", I would be able to convert
it into a database query and fetch the results fairly quickly.

I have a couple of questions:
1. I am not using any standard search engine since I don't think they
index and provide the level of detail I would need. So, does my design
of using a database in the above manner sound good?
2. On top of this platform, I plan to build layers that do intelligent
search - say using business logic, it queries and finds out all users
who got errors and displays them in a UI.

I am curious to know whether there is a better approach to this.

Thanks
Bruce
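
The word-per-row design described in this post is essentially an inverted index. A minimal in-memory sketch of the idea follows, assuming whitespace/punctuation tokenization and case-insensitive matching; the names `index`, `add_line` and `search` and the sample log lines are hypothetical, and a real version would keep the postings in a database table rather than a dictionary.

```python
from collections import defaultdict
import re

# Hypothetical in-memory stand-in for the proposed word table:
# word -> set of (file name, line number) postings.
index = defaultdict(set)

def add_line(file_name, line_no, text):
    """Store one posting per word, mirroring one database row per word."""
    for word in re.findall(r"\w+", text.lower()):
        index[word].add((file_name, line_no))

def search(query):
    """Return the postings of lines that contain every word of the query."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        # Intersecting posting sets plays the role of the SQL joins.
        result &= index.get(word, set())
    return result

add_line("server.log", 1, "User bruce connected to server")
add_line("server.log", 2, "User alice connected to server")
add_line("server.log", 3, "bruce sent a message")

print(sorted(search("bruce connected")))
```

Each extra query word costs one more set intersection, which corresponds to the extra join the database version would need.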
Jon Skeet [C# MVP] - 15 Mar 2008 21:46 GMT
> I am designing an app to do log search efficiently. I have gigabytes
> of server logs that contain all kinds of information - typically I
[quoted text clipped - 16 lines]
> index and provide the level of detail I would need. So, does my design
> of using a database in the above manner sound good?

It sounds like you're ignoring the fact that the database itself is
probably good at doing full text indexing. Why not have one row per log
entry, and let the database handle the complicated stuff?

Alternatively, if you're really just building an index, consider using
a product built specifically for indexing, such as Lucene.

> 2. On top of this platform, I plan to build layers that do intelligent
> search - say using business logic, it queries and finds out all users
> who got errors and displays them in a UI.
>
> I am curious to know whether there is a better approach to this.

That part sounds perfectly reasonable - although again, it may well be
easier if you have one row per log entry, with separate columns for
the nature of the entry (error, info etc), the affected user, etc.
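
As a rough illustration of the one-row-per-entry layout with separate columns, here is a sketch using Python's built-in sqlite3 module as a stand-in for whatever database is actually used. The table name, column names, index and sample rows are all assumptions made for the example, not anything specified in the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE log_entry (
        id        INTEGER PRIMARY KEY,
        logged_at TEXT NOT NULL,   -- ISO-style timestamp, sorts correctly as text
        level     TEXT NOT NULL,   -- 'error', 'info', ...
        user_name TEXT,            -- NULL when no user could be identified
        message   TEXT NOT NULL
    )
""")
# Hypothetical index supporting "what did this user do in this time span".
conn.execute("CREATE INDEX idx_user_time ON log_entry (user_name, logged_at)")

conn.executemany(
    "INSERT INTO log_entry (logged_at, level, user_name, message) "
    "VALUES (?, ?, ?, ?)",
    [
        ("2008-03-15 21:00:05", "info", "bruce", "User connected to server"),
        ("2008-03-15 21:02:10", "error", "bruce", "Message delivery failed"),
        ("2008-03-15 21:03:00", "error", "alice", "Login timed out"),
        ("2008-03-16 09:00:00", "error", "bruce", "Disk quota exceeded"),
    ],
)

def errors_for(user, start, end):
    """Errors one user hit within a time span, oldest first."""
    cur = conn.execute(
        "SELECT logged_at, message FROM log_entry "
        "WHERE user_name = ? AND level = 'error' "
        "AND logged_at BETWEEN ? AND ? ORDER BY logged_at",
        (user, start, end),
    )
    return cur.fetchall()

def users_with_errors():
    """All users that have at least one error entry."""
    cur = conn.execute(
        "SELECT DISTINCT user_name FROM log_entry "
        "WHERE level = 'error' ORDER BY user_name"
    )
    return [row[0] for row in cur.fetchall()]

print(errors_for("bruce", "2008-03-15 00:00:00", "2008-03-15 23:59:59"))
print(users_with_errors())
```

With the user and the entry type in their own columns, the "all users who got errors" layer becomes a plain query rather than a text search.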

Signature

Jon Skeet - <skeet@pobox.com>
http://www.pobox.com/~skeet   Blog: http://www.msmvps.com/jon.skeet
World class .NET training in the UK: http://iterativetraining.co.uk

Bruce - 15 Mar 2008 23:41 GMT
> > I am designing an app to do log search efficiently. I have gigabytes
> > of server logs that contain all kinds of information - typically I
[quoted text clipped - 37 lines]
> Jon Skeet - <sk...@pobox.com>http://www.pobox.com/~skeet  Blog:http://www.msmvps.com/jon.skeet
> World class .NET training in the UK:http://iterativetraining.co.uk

Thanks for the reply Jon.
>Why not have one row per log entry, and let the database handle the complicated stuff?
Won't the search get less efficient if I do this?
If the row reads "Bruce sent a message" and I search for Bruce, my
query would be like: Select xx from xxx where logcolumn like '%Bruce%'.
Isn't doing the "like" inefficient, since the database would have
to do a full table scan?
(whereas if I just had the word, the index can directly get me the
row. Multiple words would require joins though)
Also, I would not know from a log entry what/where the user name is, so
I could not have a separate column for it.

Thanks
Bruce
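
The concern about "like" with a leading wildcard can be checked by asking the database for its query plan. The sketch below does this with SQLite through Python's sqlite3 module, purely as a stand-in: the table and index names are made up, and other database engines expose and word their plans differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One table with a whole log line per row, one with a word per row.
conn.execute("CREATE TABLE log_line (id INTEGER PRIMARY KEY, line TEXT)")
conn.execute("CREATE TABLE log_word (word TEXT, line_id INTEGER)")
conn.execute("CREATE INDEX idx_word ON log_word (word)")

def plan(sql):
    """Return the 'detail' text of each step in SQLite's query plan."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

like_plan = plan("SELECT id FROM log_line WHERE line LIKE '%Bruce%'")
word_plan = plan("SELECT line_id FROM log_word WHERE word = 'bruce'")

print(like_plan)   # reports a scan of log_line
print(word_plan)   # reports a search using idx_word
```

The exact plan wording varies between SQLite versions, but the first query is reported as a scan and the second as an index search, which matches the reasoning in the post. It does not by itself rule out the one-row-per-entry layout, since a full text index serves the same purpose as the word table.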
Jon Skeet [C# MVP] - 16 Mar 2008 00:06 GMT
<snip>

> >Why not have one row per log entry, and let the database handle the complicated stuff?
> Won't the search get less efficient if I do this?
[quoted text clipped - 4 lines]
> (whereas if I just had the word, the index can directly get me the
> row. Multiple words would require joins though)

Look into full text indexing. A quick look at the docs for SQL Server
2005 suggests that you'd use a "CONTAINS" clause - but I haven't
actually *used* full text indexing myself, so it's new to me too :) I
just know that various databases support it, and it avoids you having
to reinvent the wheel. (I believe it also supports cunning stuff like
word stemming, and possibly even coping with typos etc.)
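
To give a feel for what database full text indexing looks like with one row per log entry, here is a small sketch. Note that it swaps in SQLite's FTS5 extension (via Python's sqlite3 module) rather than SQL Server's "CONTAINS", so the syntax is not what SQL Server uses; it also assumes the local SQLite build has FTS5 compiled in. The table name and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One row per log entry; the 'porter' tokenizer adds simple English stemming.
conn.execute(
    "CREATE VIRTUAL TABLE log_fts USING fts5(message, tokenize='porter')"
)
conn.executemany(
    "INSERT INTO log_fts (message) VALUES (?)",
    [
        ("Bruce connected to server",),
        ("Alice connected to server",),
        ("Bruce sent a message",),
    ],
)

def fts_search(query):
    """Return messages matching all query terms, in insertion order."""
    cur = conn.execute(
        "SELECT message FROM log_fts WHERE log_fts MATCH ? ORDER BY rowid",
        (query,),
    )
    return [row[0] for row in cur.fetchall()]

print(fts_search("bruce connected"))
print(fts_search("connecting"))
```

The second query shows the stemming mentioned above: "connecting" and "connected" reduce to the same stem, so both "connected" entries match without any hand-built word table.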

> Also, I would not know from a log entry what/where the user name is.

Is there some way of scanning for known entry patterns when you're
putting the entries into the database, and fetching really useful
information like that once, rather than searching multiple times?
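
One way to read this suggestion is to match each line against a list of known entry patterns at insert time and store the extracted fields alongside the raw text. The sketch below assumes three made-up patterns; the pattern list, field names and the `classify` helper are illustrative only, and real logs would need their own expressions.

```python
import re

# Hypothetical entry patterns; real logs would need their own.
PATTERNS = [
    ("connect", re.compile(r"^User (?P<user>\w+) connected to (?P<target>\w+)$")),
    ("message", re.compile(r"^(?P<user>\w+) sent a message$")),
    ("error", re.compile(r"^ERROR for user (?P<user>\w+): (?P<detail>.+)$")),
]

def classify(line):
    """Return (kind, fields) for the first matching pattern, else ('unknown', {})."""
    text = line.strip()
    for kind, pattern in PATTERNS:
        match = pattern.match(text)
        if match:
            return kind, match.groupdict()
    return "unknown", {}

for sample in [
    "User bruce connected to server",
    "Bruce sent a message",
    "ERROR for user bruce: disk quota exceeded",
    "Scheduled maintenance started",
]:
    print(classify(sample))
```

Lines that match no pattern can still be stored with an empty user column and found through text search, so the patterns only need to cover the entry types that are queried often.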

Bruce - 16 Mar 2008 00:49 GMT
> <snip>
>
[quoted text clipped - 23 lines]
> Jon Skeet - <sk...@pobox.com>http://www.pobox.com/~skeet  Blog:http://www.msmvps.com/jon.skeet
> World class .NET training in the UK:http://iterativetraining.co.uk

Full text indexing sounds pretty interesting. I took a look, and it
seems that it does not happen automatically and that I need to
populate the index manually. But I will investigate it further.

Regarding fetching multiple entries at once, I think it depends on
what kind of information users are searching for. I will keep the idea
in mind. Thanks for the input Jon.

Bruce
