[Dev] btree deletes (Re: Data outside the box)David McCusker Mon, 02 Dec 2002 19:34:21 -0800
Jeremy Hylton wrote: > There's definitely a bit of culture shock talking to you, David :-). Sorry, I don't do it on purpose. I often focus on optimization close to the point where the issue is practical, and folks have often hired me specifically to make things run faster or scale better. That's been what I do for a living since around 1995, when I was hired on OpenDoc. Usually it was to make an earlier version of something actually scale, and actually perform as advertised. This is quite a bit harder than doing it the first time on purpose, since retrofitting is almost more handicap then aiming carefully at first. My last job was fun because I was allowed to make things scale and run fast the first time through. I'm basically an early optimization guy, and that doesn't work for most people. I'm not sure why it gets so many folks into the trouble it seems to. Maybe they don't decide what should go fast to begin with. (I usually have folks tell me their priorities since one can't optimize everything, at least when there are costs.) > I'm an opponent of premature optimization, and I've found that > performance that is so much worse for one application is just fine for > another application. I'd rather write the application in a > straight-forward way then use benchmarks and profilers to understand > how it behaves. I'm essentially at the opposite extreme, to the degree no one ever tries to take the position I do, which is to optimize everything the first time through, provided I don't spend more time doing it that way. I don't have trouble thinking of the fastest way to do everything, and since I always have, it's what comes naturally to me without thinking. In fact, it would take longer to write slow code, since I'd have to work at it. (Please don't read this as obnoxious or bragging. I merely try to do all operations in O(1) when possible, and then downgrade them only as much as is really necessary depending on resources and complexity.) But it's easier to take my approach when you must write your own classes for collections. If collections already exist in a language, the path of least resistance is to use what appears already, since one can do in a few minutes what would take hours to code manually. Benchmarks and profilers don't actually work as an approach to making something go fastest, even though they are good at making things go *faster*. They are good at indicating code that use more cycles than other code. But if all the code is slow, the basis of comparison can make it hard to pin the blame on any one piece, if it's everywhere. Anyway, for Chandler certainly we expect to profile code and then make parts faster when necessary. I'm sure I'll be doing a lot of that. In the case of btrees, I think of them as being specifically designed for optimal performance for a given type of problem, so it puzzles me to think of implementing them slowly at first with the intention of making them go faster later. In particular, if you write btrees with binary search, then you must throw away most of the code when another architecture is used with keys in the inner nodes, and profiling would not show you another way could be used. The beginning of an analysis of what is wrong with binary searches in btrees can be found in this old weblog entry: http://www.treedragon.com/ged/map/ti/newNov01.htm#10nov01-scaling For large populations, binary searches with increase the total number of disk seeks quite dramatically for each new object added, to the point where adding a new member will consume over a second of disk access time when several indexes are involved. And that can happen at the 100,000 member mark. The million member mark would be worse. > Obviously, you got a lot of domain-specific experience, but it's not > clear how to read your "performance is so much worse." How much > worse? For what pattern of application behavior? I wouldn't claim a lot of experience. My comment is mostly theory based in the sense that btrees are supposed to be O(logN), and lists are linear O(N), and btrees that stop balancing will degenerate into a mix of patterns with O(logN) behavior where balanced, and linear behavior in the list-like parts. If you have a huge population, say about a million, and the population goes up and down enough to unbalance the trees without actually tearing them down, you might get a big enough percentage of list-like shapes to increase time latency by at least some constant factor that is noticeable at runtime. It could be worse than constant, but it's not likely to be, since adds will tend to balance nodes by merging, so I'd expect the result to be merely sub-efficient node occupancy. Efficient occupancy for highly thrashed trees is about 85% when only two nodes get merged when balancing. I could write thrashing unit tests for ZODB btrees and then measure occupancy to satisfy my curiosity. But you're right, one should actually run tests cases to see what the difference is performance is really like for real app populations in realistic unit tests. However, it's hard to compare without pluggable substitutes. > Anyway, I've not spent much time at all working on the BTrees code. > So I'll attach a length excerpt from the Maintainer.txt that Tim > Peters wrote; it's in the source distribution. (Have you looked at > the source yet?) I started reading the Python source rather than the C source, since Python is what I need to learn, and the ZODB frameworks. It woulnd't occur to me to start with C source code for btrees in ZODB. My reading list is so large right now, and it's not prioritized very much. For example, I was reading the entire persistence-sig archive when your reply arrived. Was this better than reading ZODB source? I dunno. I'm just working away at things. :-) But now I'm getting the idea there's some documentation in the source tree, so I'll go looking for that in particular. David
|