To-Do List for LKRhash
======================

TESTING
=======

* Write a C test, to directly test the C API. HashTest half does it,
  as it can use the public API and TypedHashTable.
* Build a publicly distributable test program, as a sample for
  customers. E.g., a dictionary component for ASP.
* hashtest: modify so that each thread can work on a completely
  separate hashtable, instead of sharing a global hashtable. This will
  allow us to discover the maximum possible scaling, which is probably
  less than the `ideal' scaling, #CPUs * 1-thread-perf.
* hashtest\kernel: split into driver and usermode test program.
  Usermode test program should take care of parsing the arguments and
  loading the test data into memory, then use an ioctl to pass it down
  to the kernel driver. The only code in the kernel driver, apart from
  LKRhash itself, should be the goo to crack the ioctl and DriverEntry.
  Usermode should print results.
* Better tests for ApplyIf family.

KERNEL-MODE
===========

* kernel locks: should they block at all? Consider implications if
  running on some usermode program's thread.
* Fix global objects in driver, including the global list of tables.
* Implement kernel-mode version of !lkrhash.
* Think about running at DISPATCH_LEVEL instead of PASSIVE_LEVEL.
  Which functions should be pageable vs. non-pageable? Memory
  allocators? Lock types?
* Use ?
* Memory allocation pool is a parameter for LKRhash public
  constructors, but it's ignored. Need virtual base for allocators?
* DONE: Control debug spew: provide some way of setting
  g_fDebugOutputEnabled

LOCK IMPLEMENTATION
===================

* WriteLock: make all return a value instead of `void' that's passed
  into WriteUnlock. Ditto for ReadLock, etc. Needed for CKSpinLock and
  OldIrql.
* Put delays into the inner loop of the spins, to see if that reduces
  cache sloshing.
* Enable per-class instrumentation for locks, as opposed to the
  current all-or-nothing system for per-lock instrumentation.
  Mostly present already, just needs a little factoring.
* Locks code. Move all member function implementations and enums into
  locks.cpp. Locks.h should be opaque declarations only.
* Rewrite a few key functions, such as CSmallSpinLock::WriteLock or
  CReaderWriterLock3::ReadOrWriteLock, in x86 assembler.
* Build locks as a separate statically linked library.
* Refcount the initialization/cleanup routines?
* Move CKSpinLock, CEResource, and CFastMutex into
* DONE: Sprinkle KeEnterCriticalRegion (and KeLeaveCriticalRegion) in
  the various locks, to prevent APCs being delivered that might
  suspend the thread that holds the lock.

MISCELLANEOUS
=============

* Reduce the three versions of LKRhash in IIS6 to just this one.
* Currently, we key the number of subtables off LK_TABLESIZE times a
  function of the number of CPUs. However, we only need a lot of
  subtables under two circumstances: (a) many millions of elements
  (esp. on Win64, where total number of elements might exceed 2^32),
  or (b) high contention. There isn't necessarily a correlation
  between large table size and high contention. With the multi-reader
  locks, high contention only arises if there are a lot of
  insertions/deletions. If the table is not modified much after it's
  built, contention shouldn't be an issue and there's little advantage
  to having a lot of subtables.
* Provide a DeleteKey variant that returns a pointer to the record
  that is being removed from the table.
* Make AddRefRecord return the new reference count.
* Inline the array of pointers to subtables within CLKRHashTable,
  instead of dynamically allocating it.
* Inline _FindBucket by hand into Delete(Key|Record), InsertRecord,
  and Find(Key|Record).
* Factor out memory stuff into LKR-mem.cpp.
* Add some instrumentation: #allocs, #expands, #contracts, etc.
* Add some flags to LKR_Initialize: default size, output tracing, etc.
* Should ApplyIf(LKP_DELETE) call the Action function before deleting?
  Or add LKP_PERFORM_DELETE[_STOP]
* Add debug print routines for the other enumerations and for lock
  state to LKRhash.
* Experiment with having no bucket locks whatsoever, just subtable
  locks. This should make operations a little faster, but presumably
  will hurt multiprocessor scalability.
* Implement fMultiKeys to provide support for multiple, identical
  keys. Needed for EqualRange, hash_multiset, and hash_multimap. See
  CLKRLinearHashTable::_InsertRecord for details on what needs to be
  changed.
* Provide implementations of the STL collection classes: hash_map,
  hash_set, hash_multimap, and hash_multiset. Must provide full
  implementation of STL iterator semantics (pair)
* const_iterators for STL-style iterators
* Consider providing some kind of implicit locking with STL-style
  iterators.
* Add version.subversion number to CLKRLinearHashTable.
* Build lkrhash as a separate statically linked library.
* Better Statistics: #buckets, density, by subtable
* Experiment with new hash function from Paul Larson.
* Public API in LKR-hash.h: remove dependency on irtlmisc.h and
  irtldbg.h
* Use IsBadCodePtr to validate callback functions.
* Make exception-safe. What happens if callback routines
  (AddRefRecord/ExtractKeys or ApplyIf) access violate (throw an SEH
  exception) or throw a C++ exception? Table and bucket locks won't be
  released, state variables may be corrupted, etc. LKRhash code should
  never throw any kind of exception.
* Add some kind of auto object for readlocking or writelocking a
  table, so that the table automatically gets unlocked by auto-obj's
  destructor.
* Use auto_ptrs.
* Port to C# (Chris Tracy has started on this). Andre Rosenthal has
  started a port to Managed C++.
* Make LKRhash available as a static library as well as a DLL.
* DONE: break apart lkrhash.cpp: iterators, applyif, stats, etc
* DONE: Always step forward through all subtables, to avoid possible
  deadlock, when acquiring subtable locks
* DONE: Test new contraction algorithm.
* DONE: sublocks for ApplyIf
* DONE: Provide a C API wrapper
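
Sketch for the "auto object" item under MISCELLANEOUS: one possible shape
for scope-based read and write locking, where the destructor does the
unlock. DemoTable is a hypothetical stand-in with ReadLock/ReadUnlock and
WriteLock/WriteUnlock members; it is not the LKRhash class, and the names
AutoReadLock, AutoWriteLock, and DemoAutoLocks are invented here for
illustration only.

```cpp
#include <cassert>

// Hypothetical stand-in for a lockable table. It only counts lock
// state so the effect of the guards can be observed.
class DemoTable {
public:
    void ReadLock()    { ++m_cReaders; }
    void ReadUnlock()  { --m_cReaders; }
    void WriteLock()   { m_fWriter = true; }
    void WriteUnlock() { m_fWriter = false; }

    int  Readers() const   { return m_cReaders; }
    bool HasWriter() const { return m_fWriter; }

private:
    int  m_cReaders = 0;
    bool m_fWriter  = false;
};

// Takes a read lock in the constructor, releases it in the destructor.
template <class Table>
class AutoReadLock {
public:
    explicit AutoReadLock(Table& table) : m_table(table) { m_table.ReadLock(); }
    ~AutoReadLock() { m_table.ReadUnlock(); }

    // Non-copyable: a copy would unlock twice.
    AutoReadLock(const AutoReadLock&) = delete;
    AutoReadLock& operator=(const AutoReadLock&) = delete;

private:
    Table& m_table;
};

// Same idea for the write lock.
template <class Table>
class AutoWriteLock {
public:
    explicit AutoWriteLock(Table& table) : m_table(table) { m_table.WriteLock(); }
    ~AutoWriteLock() { m_table.WriteUnlock(); }

    AutoWriteLock(const AutoWriteLock&) = delete;
    AutoWriteLock& operator=(const AutoWriteLock&) = delete;

private:
    Table& m_table;
};

// Returns true if both locks end up released after their scopes close.
bool DemoAutoLocks() {
    DemoTable table;
    {
        AutoReadLock<DemoTable> guard(table);
        assert(table.Readers() == 1);   // held inside the scope
    }                                   // guard's destructor unlocks here
    {
        AutoWriteLock<DemoTable> guard(table);
        assert(table.HasWriter());
    }
    return table.Readers() == 0 && !table.HasWriter();
}
```

Because the unlock lives in a destructor, it also runs on early returns
and during C++ exception unwinding, which bears on the "Make
exception-safe" item. It does not by itself cover SEH exceptions.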