.NET Forum / Languages / C# / March 2008
How to implement my array/collection
|
|
Thread rating:  |
Maddy - 18 Mar 2008 10:35 GMT Hi, I have just migrated to C#. I need some advice on the following problem: I need to create an object/structure with characteristics A,B and C. A and B are Integers while C consists of 2 Integers. I need to store data in these, every 5 sec or so, each value with a DateTime stamp/ index. I will need a method to write the value with the DateTime. Another method to get the value for a specific DateTime. However, for getting the value, there might not be a corresponding DateTime index. In that case, the value closes to that time should be provided. What is the best method of implementing this? I would require ten instances of this object at any time. PLease advice. Thanks Maddy
Marc Gravell - 18 Mar 2008 11:06 GMT Well, 10 is trivial... to keep things simple I would probably just throw the timstamp on as a property, and use LINQ to find the closest (via the absulute time delta); just add to the end of the list, and when you have too many (11) just recycle (remove the front item, or [trickier] round-robin over the values)
using System; using System.Collections.Generic; using System.Linq; class Foo { public DateTime Timestamp { get; set; } public int A { get; set; } public int B { get; set; } private readonly Bar c = new Bar(); public Bar C { get { return c; } } } class Bar { public int X { get; set; } public int Y { get; set; } } static class Program { static void Main() { var list = new List<Foo> { new Foo { Timestamp = DateTime.Today, A = 1, B = 2, C = {X = 3, Y = 4}}, new Foo { Timestamp = DateTime.Today.AddDays(1), A = 5, B = 6, C = {X = 7, Y = 8}} }; DateTime key = DateTime.Now; Foo item = list.OrderBy(foo => Math.Abs((foo.Timestamp - key).Ticks)).FirstOrDefault(); if (item != null) { Console.WriteLine("{0}: {1}, {2}; {3}, {4}", item.Timestamp, item.A, item.B, item.C.X, item.C.Y); } } }
Maddy - 20 Mar 2008 03:28 GMT Thanks for the ideas. Few things however- When I said only 10 objects, I meant the entire object. A,B and C would have about 400 to 500 values. Would have so many values in memory slow down the app? Should I save them more often to a file? Secondly each A, B and C need an individual timestamp.
Marc Gravell - 20 Mar 2008 06:28 GMT > Thanks for the ideas. Few things however- > When I said only 10 objects, I meant the entire object. A,B and C > would have about 400 to 500 values. Would have so many values in > memory slow down the app? Should I save them more often to a file? > Secondly each A, B and C need an individual timestamp. OK - can I just double check what you are describing? In terms of a visualisation, do you mean:
SomeList [~10 of]
> "A"s > int, timstamp pair [~500 of]
> "B"s > int, timestamp pair [~500 of]
> "C"s > int,int,timestamp triple [~500 of]
or do you mean:
SomeList [~10 of]
> Items [~500 of] > timestamp, int "A", int "B", {int/int} "C"
- i.e. are A/B/C timstamped separately, or is it each combination of an A,B & C that is timstamped.
Also 500 is not a big number. I wouldn't worry about memory - but how often do you need to seach it for a value (by date)? knowing that might help pick the most suitable structure. If it is infrequent (i.e. 5s, the only timing in your post) then I wouldn't get excited, and would just use a List search (as already posted). [In any event, keep the data in timestamp sequence] If it is more frequent than this, then perhaps a binary search - or if you know the data is coming in at regular intervals (every 5s) you could use interpolation of the query value between the start/end values to find a first guess and then hunt from here.
Let me know how often you search and I'll try to knock an example together...
Marc
Marc Gravell - 20 Mar 2008 09:15 GMT Here's an example of an interpolated search... there might be easier ways, but it appears to work very well if the data is approximately linear (i.e. the keys are coming in at a fairly regular spacing - they don't have to be exactly spaced).
Note that it doesn't recurse - it takes a stab using interpolation and then walks the rest of the way from there. My reasoning is: if the first guess is fairly close, then we haven't got far to go anyway; if the first guess is wrong, then it means that the list is wildly non-linear, so there is no reason to suspect that another interpolation will be any better - may as well just treck...
You wouldn't keep the [On]Performance event/method in production code - this is just to prove that it hasn't had to walk very far, even for long lists. With neatly spaced data like above, I get "1" for a value that is an exact match (i.e. we jumped to the item on our first guess) and "2" otherwise (meaning it had to check a second value). Not bad.
using System; using System.Collections.Generic;
static class Program { static void Main() { LinearList<DateTime, Foo> list = new LinearList<DateTime, Foo>(Interpolate); list.Performance += new EventHandler<CounterEventArgs>(list_Performance); // lots of sample data... // note that it is *roughly* linear (doesn't have to be exact) Random rand = new Random(987654); DateTime lookForThis = DateTime.MaxValue; for (int days = -350; days < 350; days++) { DateTime key = DateTime.Today.AddDays(days). AddMinutes(rand.Next(-600, 600)); if (days == 100) { lookForThis = key; } list.Add(key, new Foo { A = days }); } Foo foo1 = list.FindNearestValue(DateTime.Today.AddHours(-6)), foo2 = list.FindNearestValue(DateTime.Today.AddHours(6)), foo3 = list.FindNearestValue(DateTime.Today.AddHours(18)), foo4 = list.FindNearestValue(lookForThis); }
static void list_Performance(object sender, CounterEventArgs e) { Console.WriteLine("OpCount: {0}", e.Counter); }
public static double Interpolate(DateTime min, DateTime value, DateTime max) { long minTicks = min.Ticks, range = max.Ticks - minTicks; double offset = value.Ticks - minTicks;
return range == 0 ? 0 : offset / range; } } class Foo {
public int A { get; set; } public int B { get; set; } private readonly Bar c = new Bar(); public Bar C { get { return c; } } } class Bar { public int X { get; set; } public int Y { get; set; } } public class LinearList<TKey, TValue> : SortedList<TKey, TValue> { private readonly Func<TKey, TKey, TKey, double> interpolation; public LinearList(Func<TKey, TKey, TKey, double> interpolation) : this(interpolation, null) { } public LinearList(Func<TKey, TKey, TKey, double> interpolation, IComparer<TKey> comparer) : base(comparer ?? Comparer<TKey>.Default) { if (interpolation == null) throw new ArgumentNullException("interpolation"); this.interpolation = interpolation; }
private int ThisOrNext(int index, TKey value) { TKey left = Keys[index]; if (Comparer.Compare(value, left) <= 0) return index; TKey right = Keys[index + 1]; if (Comparer.Compare(value, right) >= 0) return index + 1; double factor = interpolation(left, value, right); if (factor < 0 || factor > 1) { throw new InvalidOperationException("Interpolation expected in interval [0,1] since value is in range"); } return factor <= 0.5 ? index : index + 1; } public TValue FindNearestValue(TKey key) { int index = FindNearestIndex(key); return index < 0 ? default(TValue) : this.Values[index]; } public int FindNearestIndex(TKey key) { int count = Count, opCount = 1; try { switch (count) { // trivial cases case 0: return -1; case 1: return 0; case 2: return ThisOrNext(0, key); } TKey lowerBound = this.Keys[0], upperBound = this.Keys[count - 1]; IComparer<TKey> comparer = Comparer; // check if it goes outside [or is on boundary of] our range (return min/max) if (comparer.Compare(key, lowerBound) <= 0) return 0; if (comparer.Compare(key, upperBound) >= 0) return count - 1;
double factor = interpolation(lowerBound, key, upperBound);
int testIndex = (int)Math.Round((count - 1) * factor), compare = comparer.Compare(key, this.Keys[testIndex]); if (compare == 0) { return testIndex; // hit! } else if (compare < 0) { // we guessed too high... look lower for (int i = testIndex - 1; i >= 0; i--) { opCount++; compare = comparer.Compare(key, this.Keys[i]); if (compare == 0) return i; // hit! if (compare > 0) { return ThisOrNext(i, key); } } return ThisOrNext(0, key); } else // compare > 0 { // we guessed too low... look higher for (int i = testIndex + 1; i < count - 1; i++) { opCount++; compare = comparer.Compare(key, this.Keys[i]); if (compare == 0) return i; // hit! if (compare < 0) { return ThisOrNext(i - 1, key); } } return ThisOrNext(count - 2, key); } } catch { opCount = 0; throw; } finally { if (opCount > 0) OnPerformance(opCount); } } // give an indication of how good/bad our search was... public event EventHandler<CounterEventArgs> Performance; protected virtual void OnPerformance(int opCount) { if (Performance != null) Performance(this, new CounterEventArgs(opCount)); } } public sealed class CounterEventArgs : EventArgs { private readonly int counter; public int Counter {get {return counter;}} public CounterEventArgs(int counter) { this.counter = counter; } }
Maddy - 22 Mar 2008 04:43 GMT > - i.e. are A/B/C timstamped separately, or is it each combination of > an A,B & C that is timstamped. Yes, A/B/C are timestamped seperately, ie each A,b and C have their individul timestamps. I generally need to search every 5-10sec. The data is also generally coming in at a regular interval of once very second. Beyond how many values is it better to read and write from a file? Is it possible to read upto a certain number from memory and if older values are required, read from a file, ie assuming I keep saving the older values from memory to file? Thanks.
Marc Gravell - 22 Mar 2008 10:34 GMT To be honest, if you are thinking of using disk storage, and the rate is only every 5-10s, why not just pump everything into a database and use SQL to do it? It would be a lot simpler, and you wouldn't have to worry about the difference between in memory and persisted. SQL Express is free and more than up to this - blus LINQ-to-SQL should be able to cope with expressions like the one in my early posts.
I still don't really understand the A/B/C mapping...
Marc
Maddy - 24 Mar 2008 10:04 GMT > To be honest, if you are thinking of using disk storage, and the rate > is only every 5-10s, why not just pump everything into a database and [quoted text clipped - 6 lines] > > Marc Thanks for all the inputs. I think instead of A,B and C, I will just tell you my requirement. I am modelling ships whose navigation data is to be recorded and certain operations carried out on them. Course/ heading every second, speed every second and position (latitude and longitude) every 5 sec. The position may either come from external source or I might need to read the course and speed to calculate the position. Every 5-10 sec, I would be reading the course/speed / position data for certain other calculations. There will be around 10-15 ships whose data needs to be recorded and read simulataneously at any time. If I make it fully disk access, will SQL express be required or can I use a less complicated solution? Thanks, Nitin
Marc Gravell - 24 Mar 2008 10:50 GMT I'll re-read on Tuesday (long weekend) - but in real terms I would consider "install SQL Express" a very simple option...
Marc Gravell - 25 Mar 2008 09:23 GMT How much time do you need to consider at once? I wouldn't stress about thousands of samples, /especially/ if you are using an interpolated search (as it will be pretty-much instantaneous to get values - roughly O(1) as far as I can tell). Obviously you need to think about upper-bounds on memory, but you should be able to test worst-case etc by simple experiments.
As for the class architecture, what you are saying just means that rather than one LinearList, you might have 3 (per ship modelled):
class ShipHistory { // not necessarily decimal... private readonly LinearList<DateTime, decimal> headingHistory, speedHistory; // not necessarily PointF... private readonly LinearList<DateTime, PointF> positionHistory; // TODO: init lists in ctor or inline }
Then when talking about DateTime foo just get the nearest known (FindNearestValue) of each.
The alternative model would be to (when *any* input comes in) stamp all 3 in a single record, but this relies on the inputs being fairly synchronized. The above (separate logs) model will work better if you are assembling logs asynchronously or after-the-event, and wish to interpolate between them etc.
Marc
Ignacio Machin ( .NET/ C# MVP ) - 18 Mar 2008 13:03 GMT Hi,
> Hi, I have just migrated to C#. I need some advice on the following > problem: [quoted text clipped - 9 lines] > Thanks > Maddy Create a class with the members as you described above and then inside the same class declare a static List<your type>, you also declare a couple of search methods (also static) Take a look at this pseudo code:
class C { public int A public int B } class F { public int A; public int B; public C C;
public static List<F> list = new ...; public static F SearchByA(int a) { foreach(F f in list) if ( f.A == a) return f; }
}
Free MagazinesGet these publications absolutely FREE for up to 12 months. There are no hidden fees and no obligation. Simply choose a title, complete the application form and submit it. Read more ...
|
|
|