-
Notifications
You must be signed in to change notification settings - Fork 114
Fix use after free in neighbors, spring_attachments and cell attack #397
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: development
Are you sure you want to change the base?
Fix use after free in neighbors, spring_attachments and cell attack #397
Conversation
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| for (int j=0; j < all_cells->size(); j++) | ||
| { | ||
| Cell* pC = (*all_cells)[j]; | ||
| if (( pC != this) && pC->phenotype.cell_interactions.pAttackTarget == this) { | ||
| pC->phenotype.cell_interactions.pAttackTarget = NULL; | ||
| } | ||
| } |
Copilot
AI
Dec 3, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The loop iterates over all_cells without synchronization, and the #pragma omp parallel for creates concurrent writes to pAttackTarget. This can cause race conditions if cells are being deleted or modified concurrently. Consider removing the parallel directive or adding proper synchronization.
| void Cell::remove_self_from_attackers( void ) | ||
| { | ||
| #pragma omp parallel for | ||
| for (int j=0; j < all_cells->size(); j++) | ||
| { | ||
| Cell* pC = (*all_cells)[j]; | ||
| if (( pC != this) && pC->phenotype.cell_interactions.pAttackTarget == this) { | ||
| pC->phenotype.cell_interactions.pAttackTarget = NULL; | ||
| } | ||
| } | ||
| } |
Copilot
AI
Dec 3, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This O(n) loop iterates through all cells to clean up attack targets, which can be expensive for large simulations with many cells. Consider using a reverse index (e.g., a set/list of attackers stored on each potential target cell) to make this operation O(k) where k is the number of attacking cells, which is typically much smaller than n.
| for ( int i = 0; i < all_cells->size(); i++ ) | ||
| { | ||
| Cell* pC = (*all_cells)[i]; | ||
| if (pC != this) | ||
| { | ||
| auto SearchResult = std::find( | ||
| pC->state.neighbors.begin(),pC->state.neighbors.end(),this ); | ||
| if ( SearchResult != pC->state.neighbors.end() ) | ||
| { | ||
| std::cout << "Cell " << pC->ID << " still has cell " << this->ID << " as a neighbor!" << std::endl; | ||
| pC->state.neighbors.erase( SearchResult ); | ||
| } | ||
| } | ||
| } |
Copilot
AI
Dec 3, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This loop iterates over all_cells without thread safety protections. If cell deletion can occur concurrently (e.g., from multiple threads), this could lead to iterator invalidation or accessing deallocated memory. Consider adding appropriate synchronization or documenting that delete_cell must be called from a single-threaded context.
| for ( int i = 0; i < all_cells->size(); i++ ) | ||
| { | ||
| Cell* pC = (*all_cells)[i]; | ||
| if (pC != this) | ||
| { | ||
| auto SearchResult = std::find( | ||
| pC->state.spring_attachments.begin(),pC->state.spring_attachments.end(),this ); | ||
| if ( SearchResult != pC->state.spring_attachments.end() ) | ||
| { | ||
| std::cout << "Cell " << pC->ID << " still has cell " << this->ID << " as a spring attachment!" << std::endl; | ||
| pC->state.spring_attachments.erase( SearchResult ); | ||
| } | ||
|
|
||
| } | ||
| } |
Copilot
AI
Dec 3, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This loop iterates over all_cells without thread safety protections. If cell deletion can occur concurrently (e.g., from multiple threads), this could lead to iterator invalidation or accessing deallocated memory. Consider adding appropriate synchronization or documenting that delete_cell must be called from a single-threaded context.
|
I remember being quite opposed to this PR, but I may have misunderstood what it was doing? I had thought we were looping (or even double looping?) over all cells every mechanics iteration or something like that. I see now that this is instead at every deletion, which is far more rare. So, while this is still a bandaid (we should just have all these lists be accurate all the time), this is not nearly the performance hit I recall imagining. If we don't have the collective time to track down the source of this bug, this patch should be prioritized for merging |
|
I diagrammed out how Let
It's that thread 1 seemed to get blocked so long as T2 does its attachments that seems likely to make this vanishingly rare. I think this also reveals the solution to this part of the bug giving rise to this PR: wrap the joint calls to void attach_cells_as_spring( Cell* pCell_1, Cell* pCell_2 )
{
#pragma omp critical {
pCell_1->attach_cell_as_spring(pCell_2);
pCell_2->attach_cell_as_spring(pCell_1);
}
return;
}This way, once Thread 1 decides to detach Haven't looked into neighbors to see if a similar dynamic is possible there. |
|
I've just witnessed this1 happen in very large numbers. Both on real data passed on to me from a sim and in just using the template project with 5000 cells at start. In the latter case, I found four examples in the neighbor graph at time t=60.0 min. For example, cell 36 has cell 208 in its list of neighbors, but 208 does not have 36. These are their lines: ...
36: 1366,643,208,3173,1021
...
208: 3846,3173,3108,4701,1021
...output00000001_cell_neighbor_graph.txt Footnotes
|
|
this does little to explain why at t=60 there remains inconsistencies, since this should be resolved as every mechanics voxel ends up with a cell and so sets its max interaction distance accordingly. more work required there. |
I had some crash that I wanted to fix for a while, and finally found the issues:
Here the fixes I'm proposing for now:
Clearly, this can be improved, but it fixes the present issues.