.NET Forum / Languages / C# / October 2007
Which design pattern is good for this?
ydbn - 25 Oct 2007 06:03 GMT
I need to write a program to validate a text file in CSV format. So I will have a
class DataType
and a lot of derived classes for the various types, e.g. IntType, StringType, FloatType, MoneyType, etc.
Each column of a given type may or may not accept null/empty values. It may also have a different max length for StringType, IntType, etc.
Each column may also have range checking: some IntType columns can only be between 1 and 25, and some StringType columns can only take certain values.
Which design pattern is best for this? A dictionary with the decorator design pattern? That sounds too heavy...
namekuseijin - 25 Oct 2007 06:23 GMT
> I need to write a program validate a text file in CSV format. So I will have a
> [quoted text clipped - 12 lines]
> Which design patter is best for this? A dictionary with decorate design
> pattern? sound too heavy....
how about the WTF?! design pattern?
seriously, a better pattern is DRY: why implement the aforementioned classes when you could simply do, say, Int.Parse( text ) for a given chunk of text inside a try block?
Jon Skeet [C# MVP] - 25 Oct 2007 08:18 GMT
> > Which design patter is best for this? A dictionary with decorate design
> > pattern? sound too heavy....
[quoted text clipped - 4 lines]
> classes when you could simply do, say, Int.Parse( text ) for a given
> chunk of text inside a try block?
Because it provides encapsulation of parsing and validation. Instead of having a giant switch statement (or something similar) the OP can define the columns, and then just keep calling Parse etc. Sounds reasonable to me.
Now, as for your suggestion: if you're going to try to parse something and catch exceptions, the TryParse methods are better than calling Parse inside a try block.
Jon
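To make the Parse-versus-TryParse point concrete, here is a minimal sketch of both styles. The class and method names (IntField, ParseWithCatch, ParseWithTryParse) are hypothetical and not from the thread; only int.Parse and int.TryParse are real framework calls. Invariant culture is an assumption made so the result does not depend on the machine's locale.

```csharp
using System;
using System.Globalization;

public static class IntField
{
    // Parse inside try/catch: every malformed value costs a thrown and caught exception.
    public static bool ParseWithCatch(string text, out int value)
    {
        value = 0;
        if (text == null)
            return false; // int.Parse would throw ArgumentNullException here
        try
        {
            value = int.Parse(text, NumberStyles.Integer, CultureInfo.InvariantCulture);
            return true;
        }
        catch (FormatException)
        {
            return false; // not a number
        }
        catch (OverflowException)
        {
            return false; // outside the range of int
        }
    }

    // TryParse: failure is reported through the return value and no
    // exception is thrown for bad input; value is 0 when parsing fails.
    public static bool ParseWithTryParse(string text, out int value)
    {
        return int.TryParse(text, NumberStyles.Integer, CultureInfo.InvariantCulture, out value);
    }
}
```

Both methods give the caller the same answer. The difference is that the second never throws for bad input, which matters when a file has many invalid rows.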
namekuseijin - 25 Oct 2007 19:25 GMT
> Because it provides encapsulation of parsing and validation. Instead
> of having a giant switch statement (or something similar) the OP can
> define the columns, and then just keep calling Parse etc. Sounds
> reasonable to me.
a single method/function definition with the "switch statement" provides just enough encapsulation to the job at hand. Why waste time implementing several redundant classes?
> Now, as for your suggestion: if you're going to try to parse something
> and catch exceptions, the TryParse methods are better than calling
> Parse inside a try block.
wow, somehow sounds like they do exactly that underneath...
seriously, a better pattern seems to be KISS...
Jon Skeet [C# MVP] - 25 Oct 2007 19:33 GMT
> > Because it provides encapsulation of parsing and validation. Instead
> > of having a giant switch statement (or something similar) the OP can
[quoted text clipped - 4 lines]
> provides just enough encapsulation to the job at hand. Why waste time
> implementing several redundant classes?
They're not redundant, IMO - they're encapsulating behaviour, and in a flexible way. Individual objects are then responsible for defining how a column behaves, in all aspects of parsing and validating.
Without separate objects for each column, where would you put rules for lengths, optional/mandatory values, potentially minimum/maximum values for numbers etc? Is that all going to be part of the giant switch statement too?
I have no problem with having many small classes, each doing a particular thing well. I far prefer that to having giant methods.
> > Now, as for your suggestion: if you're going to try to parse something
> > and catch exceptions, the TryParse methods are better than calling
> > Parse inside a try block.
>
> wow, somehow sounds like they do exactly that underneath...
No, they don't. They avoid the exception being thrown in the first place.
> seriously, a better pattern seems to be KISS...
You think try/catch/ignore expression is simpler than using a method which tells you whether or not the value was parsed correctly? I disagree.
 Signature Jon Skeet - <skeet@pobox.com> http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet If replying to the group, please do not mail me too
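As an illustration of where per-column rules could live, here is a minimal sketch of column definition classes. All names (ColumnDefinition, IntColumn, StringColumn, Validate, AllowEmpty) are hypothetical and were not posted by anyone in the thread. The sketch assumes validation reports an error message, or null when the field is acceptable.

```csharp
using System;
using System.Globalization;

// Hypothetical base class: holds the settings every column shares.
public abstract class ColumnDefinition
{
    public string Name { get; }
    public bool AllowEmpty { get; }

    protected ColumnDefinition(string name, bool allowEmpty)
    {
        Name = name;
        AllowEmpty = allowEmpty;
    }

    // Returns null when the field is valid, otherwise an error message.
    public string Validate(string field)
    {
        if (string.IsNullOrEmpty(field))
            return AllowEmpty ? null : Name + ": value is required";
        return ValidateValue(field);
    }

    // Type-specific checks go in the derived classes.
    protected abstract string ValidateValue(string field);
}

public sealed class IntColumn : ColumnDefinition
{
    private readonly int min;
    private readonly int max;

    public IntColumn(string name, bool allowEmpty, int min, int max)
        : base(name, allowEmpty)
    {
        this.min = min;
        this.max = max;
    }

    protected override string ValidateValue(string field)
    {
        int value;
        if (!int.TryParse(field, NumberStyles.Integer, CultureInfo.InvariantCulture, out value))
            return Name + ": '" + field + "' is not an integer";
        if (value < min || value > max)
            return Name + ": " + value + " is outside " + min + ".." + max;
        return null;
    }
}

public sealed class StringColumn : ColumnDefinition
{
    private readonly int maxLength;

    public StringColumn(string name, bool allowEmpty, int maxLength)
        : base(name, allowEmpty)
    {
        this.maxLength = maxLength;
    }

    protected override string ValidateValue(string field)
    {
        return field.Length <= maxLength ? null : Name + ": longer than " + maxLength + " characters";
    }
}
```

With something like this, the OP's "IntType can only be between 1 and 25" rule becomes new IntColumn("rank", false, 1, 25), where the column name "rank" is made up. The definitions hold no per-row state, so one list of them can be reused for every line.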
namekuseijin - 26 Oct 2007 21:08 GMT
> They're not redundant, IMO - they're encapsulating behaviour, and in a
> flexible way. Individual objects are then responsible for defining how
> a column behaves, in all aspects of parsing and validating.
my whole point was to point out that "IntType, StringType, FloatType, MoneyType" are all builtin types already, even with a handy Parse method!
that's why it's redundant.
> Without separate objects for each column, where would you put rules for
> lengths, optional/mandatory values, potentially minimum/maximum values
> for numbers etc? Is that all going to be part of the giant switch
> statement too?
the giant switch will likely be way shorter than implementing the useless, redundant classes for this one-shot problem.
Jon Skeet [C# MVP] - 26 Oct 2007 22:21 GMT
> > They're not redundant, IMO - they're encapsulating behaviour, and in a
> > flexible way. Individual objects are then responsible for defining how
[quoted text clipped - 4 lines]
> handy Parse method!
> that's why it's redundant.
None of them contain settings for allowing the name of the column, nullability, other validation etc.
That's part of what would be contained within the column definition, and some of that varies by type.
> > Without separate objects for each column, where would you put rules for
> > lengths, optional/mandatory values, potentially minimum/maximum values
[quoted text clipped - 3 lines]
> the giant switch will likely be way shorter than implementing the
> useless, redundant classes for this one-shot problem.
There's more to elegant design than counting lines of code.
namekuseijin - 27 Oct 2007 02:42 GMT
> None of them contain settings for allowing
OP's original request was: "I need to write a program validate a text file in CSV format"
that is, given:
name;age;salary;
john;33;3400;
jane;28;2500;
ensure that first column is string of given length, second is int in range and third is money. None of these constraints are provided by the CSV per se, but by the programmer, including the order.
an algorithm to process this file:
try {
    lines = file.Lines()
    lines.next() // drop first line: headers
    while (line = lines.next()) {
        getName( line )
        getAge( line )
        getSalary( line )
    }
}
catch { "CSV file not ok" }
where, say, getAge could be:
getAge( line ) {
    int age = Int.Parse( CSVcolumn( 1, line ) ) // may throw an exception right away
    if !(age between min and max) throw exception
}
this is a lot more useful and simple than implementing whole classes for such a trivial and one-shot task... KISS
Jon Skeet [C# MVP] - 27 Oct 2007 09:20 GMT
> > None of them contain settings for allowing
> [quoted text clipped - 9 lines]
> range and third is money. None of these constraints are provided by
> the CSV per se, but by the programmer, including the order.
Absolutely - the programmer can put in the constraints with the types. All they need to do is create the column definitions once (which could easily be done in something like a Spring configuration file) and then call a method which can parse any file given the column definitions.
There's no need to hard code everything.
> getName( line )
> getAge( line )
> getSalary( line )
So is each of these methods going to split the line? I'd rather split the line once, and act on each element separately.
Apart from anything else, that's also a lot easier to test.
> this is a lot more useful and simple than implementing whole classes
> for such a trivial and one-shot task... KISS
Yes, 'cos we all know that CSV files never change format... I believe my solution would be just as simple, but much more flexible.
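The "split the line once, then act on each element" idea might look roughly like the sketch below. The names Column, Checks and LineValidator are hypothetical. The sketch assumes a plain separator split with no quoted fields, and it tolerates the trailing separator seen in the sample rows earlier in the thread.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical column definition: a name plus a check that returns
// null for a valid field or an error message otherwise.
public sealed class Column
{
    public string Name { get; }
    public Func<string, string> Check { get; }

    public Column(string name, Func<string, string> check)
    {
        Name = name;
        Check = check;
    }
}

public static class Checks
{
    public static Func<string, string> MaxLength(int max)
    {
        return s => s.Length <= max ? null : "longer than " + max + " characters";
    }

    public static Func<string, string> IntInRange(int min, int max)
    {
        return s =>
        {
            int value;
            if (!int.TryParse(s, NumberStyles.Integer, CultureInfo.InvariantCulture, out value))
                return "not an integer";
            return (value >= min && value <= max) ? null : "outside " + min + ".." + max;
        };
    }
}

public static class LineValidator
{
    // Splits the line exactly once, then hands each field to its column.
    public static List<string> Validate(string line, IReadOnlyList<Column> columns, char separator)
    {
        var errors = new List<string>();
        string[] fields = line.Split(separator);
        int count = fields.Length;

        // Rows such as "john;33;3400;" end with a separator, which
        // produces one extra empty field; ignore it.
        if (count == columns.Count + 1 && fields[count - 1].Length == 0)
            count--;

        if (count != columns.Count)
        {
            errors.Add("expected " + columns.Count + " fields but found " + count);
            return errors;
        }

        for (int i = 0; i < count; i++)
        {
            string error = columns[i].Check(fields[i]);
            if (error != null)
                errors.Add(columns[i].Name + ": " + error);
        }
        return errors;
    }
}
```

Because each check is a small function of one string, it can be unit tested without building a whole line or file. Changing the file format means editing the list of columns, not the validation loop.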
Jon Skeet [C# MVP] - 25 Oct 2007 08:22 GMT
> I need to write a program validate a text file in CSV format. So I will have a
> [quoted text clipped - 5 lines]
> For each column of a type, it may accept null/empty value. or not. It may
> have various max length for StringType, IntType,... etc.
So for a given file, you'll have a list of column definitions, each containing a parser and a validator, correct? I'd imagine it *may* be worth combining the parsing and validation - I wouldn't have thought there'd be many cases where the validator can be used with lots of different parsers, for instance.
> And for each column, it may have certain range checking, like some column of
> IntType can only between 1 to 25. Some StringType column can only be certain
> values.....
>
> Which design patter is best for this? A dictionary with decorate design
> pattern? sound too heavy....
I can't see how a decorator would fit in here. I'd just define an appropriate interface, and then (once) create a list of column definitions for your CSV file, each of which implements the interface.
Then either do the splitting at the "top level" and parse each part, or allow each parser to "take" however much data they need from the line, from a given position, returning how much source data they've used up, and the resulting data.
The column definitions themselves should be immutable, unchanged by the process of parsing an entry - that way they stay reusable.
Now, what do you need to *do* with the data when you've got it? That will dictate the design of how the results are stored.
Jon
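The second option, where each parser "takes" what it needs from a given position, could be sketched as follows. The names IFieldTaker and FieldTaker are hypothetical. The sketch returns the position just past the consumed field, rather than a count of characters used, and assumes the common CSV convention of double-quoted fields with "" as an escaped quote.

```csharp
using System;
using System.Text;

public interface IFieldTaker
{
    // Reads one field starting at 'position' and returns the index just past
    // it (past the separator too, if one follows).
    int Take(string line, int position, out string value);
}

// Immutable: parsing never changes the taker, so one instance can be reused.
public sealed class FieldTaker : IFieldTaker
{
    private readonly char separator;

    public FieldTaker(char separator)
    {
        this.separator = separator;
    }

    public int Take(string line, int position, out string value)
    {
        if (position < line.Length && line[position] == '"')
        {
            var text = new StringBuilder();
            int i = position + 1;
            while (true)
            {
                if (i >= line.Length)
                    throw new FormatException("unterminated quoted field");
                char c = line[i];
                if (c == '"')
                {
                    if (i + 1 < line.Length && line[i + 1] == '"')
                    {
                        text.Append('"'); // doubled quote is an escaped quote
                        i += 2;
                        continue;
                    }
                    i++; // step over the closing quote
                    break;
                }
                text.Append(c);
                i++;
            }
            if (i < line.Length && line[i] != separator)
                throw new FormatException("unexpected character after closing quote");
            value = text.ToString();
            return i < line.Length ? i + 1 : i;
        }

        // Unquoted field: everything up to the next separator or end of line.
        int end = line.IndexOf(separator, position);
        if (end < 0)
            end = line.Length;
        value = line.Substring(position, end - position);
        return end < line.Length ? end + 1 : end;
    }
}
```

A caller would loop over its column definitions, call Take once per column, and feed the returned position into the next call. This lets a field contain the separator, which a plain string.Split cannot handle.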
jehugaleahsa@gmail.com - 27 Oct 2007 16:27 GMT
> I need to write a program validate a text file in CSV format. So I will have a
> [quoted text clipped - 12 lines]
> Which design patter is best for this? A dictionary with decorate design
> pattern? sound too heavy....
A friend of mine had me implement a CSV/SSV/XML parser in terms of the IDataReader interface. It made the project he was working on a breeze.
It also gave him the ability to add column-specific constraints with a lot more ease. Putting the code in the IDataReader made the project so small and easy that 3 distinct parsers were done in a day's time.
The DataReader class has an abstract base class that allows you to specify how the data columns are parsed (this is the only "tricky" part). The base class provides intuitive conversions from the text file data to the requested type.
Personally, I treated the derived reader as a business object-like creature and created properties for things like Name, Date, Company, Favorite Ice Cream, which would retrieve the correct column, perform the correct data conversions and do constraint tests. Many people use IDataReader for their business objects - this is no different.
public DateTime Date
{
    get
    {
        DateTime date = this.GetDate(1); // get date from text file 1st column
        // do checks on date
        return date;
    }
}
I would more than love to send you my class if you are interested. However it is at work and you will need to wait till Monday or Tuesday. Just 'Reply to Author'.
Thanks, Travis
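For readers who want to try the property-per-column idea without the poster's class (which was not posted), here is a much reduced sketch. It is not an IDataReader implementation. DelimitedRecord, OrderRecord, the zero-based column indexes, the yyyy-MM-dd date format and the year check are all assumptions made for illustration.

```csharp
using System;
using System.Globalization;

// Hypothetical base class: splits one line and offers typed getters,
// loosely modelled on the GetXxx(int) style of IDataReader.
public abstract class DelimitedRecord
{
    private readonly string[] fields;

    protected DelimitedRecord(string line, char separator)
    {
        fields = line.Split(separator);
    }

    protected string GetString(int index)
    {
        return fields[index];
    }

    protected DateTime GetDate(int index)
    {
        // Throws FormatException if the text is not in the assumed format.
        return DateTime.ParseExact(fields[index], "yyyy-MM-dd", CultureInfo.InvariantCulture);
    }
}

// Each property retrieves its column, converts it and applies its constraints.
public sealed class OrderRecord : DelimitedRecord
{
    public OrderRecord(string line) : base(line, ',')
    {
    }

    public string Name
    {
        get
        {
            string name = GetString(0);
            if (name.Length == 0)
                throw new FormatException("Name is required");
            return name;
        }
    }

    public DateTime Date
    {
        get
        {
            DateTime date = GetDate(1);
            if (date.Year < 1900) // example constraint, chosen arbitrarily
                throw new FormatException("Date is before 1900");
            return date;
        }
    }
}
```

For example, new OrderRecord("Travis,2007-10-27").Date yields 27 October 2007, while a row with a malformed or too-early date throws when the property is read.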