Lock hash tables during augmented assignment#4017
Lock hash tables during augmented assignment#4017d-torrance wants to merge 2 commits intoMacaulay2:developmentfrom
Conversation
|
I feel like it would be cleaner to handle The think I can't remember is if we could have a circumstance where "#" or "." is overridden with something weird. |
|
I don't think # or . can be overridden at top-level (yet!) but I consider this to be higher priority than allowing top-level to override. This is extremely troublesome for me right now so I'll take any solution that works! |
|
. and # can't be overridden. I used to not like that, but now I kind of do. We have other operators (_, and I use @@ in the Python package for .) that we can use, keeping these two nice and low level for getting at the underlying list or hash table. I like Jay's idea. I'll play around with it soon |
|
(I deeply dislike @@ btw, you should just move the relevant types to the interpreter and use . I think) |
|
Also, I didn't mention it earlier, you have to make sure to evaluate the right hand side before locking on the table. This is a bit subtle about order of operations, but if the right hand side also uses the same hash table (which isn't crazy) then you would have the same recursive lock issue. I think this makes This is a compromise that I think is fine, but I want to point this out. In theory a custom operator can make this even worse if it looks into the table somehow, but that shouldn't happen with sensible operators. |
|
@d-torrance I can't help until after the semester but any luck with this? |
|
I haven't taken a stab at it yet, but hopefully soon! |
|
@jkyang92 in principle I agree with everything in your comment, but I struggle to see a clean way to implement it. Do you have a any suggestions? |
|
Inside of the In the hash table case, you can then:
The only danger with this is if the operator has a user defined method, it could in theory relock the hash table in step 4. You could also add a bit of code before step 1 to take a read lock on the hash table and check if the entry even exists before proceeding, I don't know that this expense is worth it. |
|
Another approach that would have fewer worries about deadlock/relocking would be to "black hole" the entry in the hash table when we are modifying it. This entails putting a placeholder entry into the hash table that indicates "this entry is being modified". The black hole then should indicate which thread is currently modifying it, and have appropriate condition variables/mutexes such that other threads can wait on it. (If creating one per black hole seems expensive you can do one per thread). It's then an error (which can be sent back to the user) if you try to modify (but not read?) an entry which the current thread has black holed, but any other thread will just wait on the black hole. To avoid deadlock, you'd need special handling of waiting on black holed entries while having created black holed entries, but as long as you allow reads through the black hole, this is not a huge issue. The advantages is that this allows not holding the lock while doing the actual operation and only when actually writing the entry back to the hash table (you still have to take a write lock to add the black hole). There are complicated questions about what sequencing of operations this corresponds to, and annoyingly, it requires a check for black holes at every write into the hash table. |
|
This is a funny compiler warning: [45/59] Building C object Macaulay2/bin/CMakeFiles/M2-binary.dir/startup.c.o
/home/mahrud/Projects/M2/research/M2/BUILD/build/Macaulay2/bin/startup.c:196:58: warning: trigraph ‘??=’ ignored, use ‘-trigraphs’ to enable [-Wtrigraphs]
196 | " setAttribute = (val,attr,x) -> ( Attributes#val ??= new MutableHashTable )#attr = x;\n"Shouldn't the compiler know to ignore a hardcoded string? ps: I actually don't think it's related to this PR but somehow this is the first I'm seeing this. |
|
@mahrud It occurs very early in the processing, even before tokenization. so the compiler can't tell if it's in a string. If we want to suppress the warning, we can split the string into two (using the fact that neighboring string literals get concatenated). Also, I checked and trigraphs have been removed in C++17 and C23. |
|
We are compiling with c++17, I thought? |
|
It's because its a C file and we're certainly not compiling with C23. |
This is the beginnings of an attempt at fixing #3928.
Here's the idea: We'd like something like
x.a += 1to be atomic whenxis a mutable hash table. So at the beginning of augmented assignment, we look to see if the left hand side is binary code with.or#as its binary operator, and then check if the left hand side of that is a hash table. If so, go ahead and lock it, and stuff the hash table in a variable so we know how to unlock it when we're done.The big problem, however, is that we currently implement hash table mutexes with a rwlock, which isn't recursive. In particular, if we lock the mutex, we can't lock it again in the same thread or we deadlock. But we're definitely going to lock it at least one more time when we call
lookupto get the value at the given key.So in the current draft, M2 immediately aborts because it runs into this problem on startup.
A possible idea would be to switch from rwlock's to recursive mutexes for hash tables. But hash tables are so fundamental in M2 that I'd rather wait until after the release to try this to give us plenty of time to test any repercussions.
Cc: @mahrud