Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Kernel/Memory: Various fixes for anonymous mmaps #24639

Merged
merged 7 commits into from
Jul 12, 2024

Conversation

brody-qq
Copy link
Contributor

@brody-qq brody-qq commented Jul 1, 2024

This PR fixes the following problems related to anonymous mmaps:

This PR also adds more test cases for anonymous mmaps.

@BuggieBot
Copy link
Member

Hello!

One or more of the commit messages in this PR do not match the SerenityOS code submission policy, please check the lint_commits CI job for more details on which commits were flagged and why.
Please do not close this PR and open another, instead modify your commit message(s) with git commit --amend and force push those changes to update this PR.

@github-actions github-actions bot added the 👀 pr-needs-review PR needs review from a maintainer or community member label Jul 1, 2024
@brody-qq brody-qq force-pushed the anon-mmap-fixes branch 2 times, most recently from 2e715f6 to ded0c9c Compare July 1, 2024 01:38
@nico
Copy link
Contributor

nico commented Jul 1, 2024

@spholz You know this code a bit, right? Want to look at this? :)

Copy link
Contributor

@Hendiadyoin1 Hendiadyoin1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from one nit
LGMT

Kernel/Memory/Region.cpp Outdated Show resolved Hide resolved
@brody-qq brody-qq force-pushed the anon-mmap-fixes branch 2 times, most recently from 38dd11e to 3dc3550 Compare July 1, 2024 16:05
Kernel/Memory/Region.cpp Outdated Show resolved Hide resolved
Copy link
Collaborator

@spholz spholz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One small question, but otherwise LGTM

{
for (size_t i = 0; i < page_count(); ++i) {
auto& page = physical_pages()[i];
bool should_cow = !page->is_shared_zero_page() && !page->is_lazy_committed_page();
Copy link
Collaborator

@spholz spholz Jul 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AnonymousVMObject::should_cow returns true if the physical page is the shared_zero_page or the lazy_committed_page, so I am unsure if we have other code expecting that pages referencing these physical pages are CoW (it does kind of make sense to make these CoW, since when writing to those we have to zero them and therefore essentially copy the contents).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And AddressSpace::try_allocate_split_region also sets the CoW bit for all pages where Region::should_cow (which uses `AnonymousVMObjet::should_cow) returns true.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah oops, initially an AnonymousVMObject will of course be backed by the actual committed pages or references to either the lazy committed page or the shared zero page.
And those don't have a cow map allocated, so none of the pages are cow.

But AddressSpace::try_allocate_split_region should I think still probably not set the cow bit for shared zero pages and lazy committed pages.

Copy link
Contributor Author

@brody-qq brody-qq Jul 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback, I completely missed the set_should_cow call in AddressSpace::try_allocate_split_region.

After taking a closer look at AddressSpace::try_allocate_split_region, I think the best approach is to just remove the loop that calls set_should_cow.
Here is the diff:

diff --git a/Kernel/Memory/AddressSpace.cpp b/Kernel/Memory/AddressSpace.cpp
index 9ea53b88a1..e96406e8b1 100644
--- a/Kernel/Memory/AddressSpace.cpp
+++ b/Kernel/Memory/AddressSpace.cpp
@@ -148,11 +148,6 @@ ErrorOr<Region*> AddressSpace::try_allocate_split_region(Region const& source_re
     new_region->set_syscall_region(source_region.is_syscall_region());
     new_region->set_mmap(source_region.is_mmap(), source_region.mmapped_from_readable(), source_region.mmapped_from_writable());
     new_region->set_stack(source_region.is_stack());
-    size_t page_offset_in_source_region = (offset_in_vmobject - source_region.offset_in_vmobject()) / PAGE_SIZE;
-    for (size_t i = 0; i < new_region->page_count(); ++i) {
-        if (source_region.should_cow(page_offset_in_source_region + i))
-            TRY(new_region->set_should_cow(i, true));
-    }
     TRY(m_region_tree.place_specifically(*new_region, range));
     return new_region.leak_ptr();
 }

I think this works because if you look at this line of code, it sets new_region to have the same m_vmobject as source_region before the loop runs:

auto new_region = TRY(Region::create_unplaced(
        source_region.vmobject(), offset_in_vmobject, move(region_name), source_region.access(), source_region.is_cacheable() ? Region::Cacheable::Yes : Region::Cacheable::No, source_region.is_shared()));

So that set_should_cow loop really isn't doing anything, since source_region and new_region reference the same vmobject and cow map.

Does this change make sense? (I hope I'm not missing something obvious here)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also ProcFS is the only other place I can find that may be impacted by not marking lazy/zero pages in the cow map.

Particularly this line for setting cow_pages:

           TRY(region_object.add("cow_pages"sv, region.cow_pages()));

So if you go with my changes, then ProcFS will report a smaller number than before for cow_pages, since it isn't counting the lazy/zero pages anymore.

I don't know enough about how this cow_pages number from ProcFS gets used to know if this is a problem, does anyone have any objections to this?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing this loop from AddressSpace::try_allocate_split_region seems fine.
This code was added in f4e7aec. Back then the m_cow_map was a member of Region:

mutable OwnPtr<Bitmap> m_cow_map;

So this loop should indeed be unnecessary now, since the m_cow_map is now owned by AnonymousVMObjects.

It may also be nice if we could have a test (in munmap-multi-region-unmapping.cpp maybe?) for splitting a region that is backed by an AnonymousVMObject with pages marked as CoW.

And about procfs potentially showing a different cow_pages count: This should be fine, we only use that procfs parameter in the pmap utility and the "Memory map" tab in SystemMonitor to display the number of CoW pages.

Copy link
Contributor Author

@brody-qq brody-qq Jul 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a new commit that removes the loop from AddressSpace::try_allocate_split_region and I added some new test code to munmap-multi-region-unmapping.cpp. Can you let me know if this test code is what you had in mind?

Edit: The new test code is part of the Tests/Kernel: Add new tests for anonymous mmaps commit

Copy link
Collaborator

@spholz spholz Jul 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that test triggers the try_allocate_split_region code path.
try_allocate_split_region seems to get called on munmaps that don't cover an entire mapping and on mmaps with MAP_FIXED that overlap, but your mmaps don't seem to overlap.

I think the test should do something like this:

  • mmap more than 1 page
  • write some data into the mapping
  • fork
  • in the child:
    • munmap part of the mapping
    • write to the still mapped part
  • in the parent:
    • wait for the child
    • verify that the original data is still there

Also, munmap-multi-region-unmapping.cpp is a legacy non-LibTest test. As you may have noticed, those are skipped by default. I am unsure if it is a good idea to add new tests to legacy test files even if they would be the right place for this test.
Sorry, I should have noticed this earlier (especially based on the filename).

(munmap-multi-region-unmapping.cpp seems like a test which we should convert to LibTest, you don't need to do that in this pr though)

Copy link
Collaborator

@spholz spholz Jul 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also don't think that not having an accompanying test should be a blocker (I am not a maintainer). I just thought it might be nice to test this behavior.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok, I misunderstood your original comment. I put the test case you described in TestAnonymousMmaps.cpp and got rid of the test I added to the legacy file.

@brody-qq brody-qq force-pushed the anon-mmap-fixes branch 8 times, most recently from e3379c0 to a3abf9f Compare July 7, 2024 00:19
@Hendiadyoin1
Copy link
Contributor

Seems like one of the other kernel PRs created some conflicts

@Hendiadyoin1 Hendiadyoin1 added the ⚠️ pr-has-conflicts PR has merge conflicts and needs to be rebased label Jul 8, 2024
AddressSpace::try_allocate_split_region() was updating the cow map of
new_region based on the cow map of source_region.

The problem is that both new_region and source_region reference the
same vmobject and the same cow map, so these cow map updates didn't
actually change anything.

This commit:
* removes the cow map updates from try_allocate_split_region()
* removes Region::set_should_cow() since it is no longer used
After a fork(), page faults on anonymous mmaps can cause a redundant
page fault to occur.

This happens because VMObjects for anonymous mmaps are initially filled
with references to the lazy_committed_page or shared_zero_page. If there
is a fork, VMObject::try_clone() is called and all pages of the VMObject
are marked as cow (via the m_cow_map).

Page faults on a zero/lazy page are handled by handle_zero_fault().
handle_zero_fault() does not update m_cow_map, so if the page was marked
cow before the fault, it will still be marked cow after the fault. This
causes a second (redundant) page fault when the CPU retries the write.

This commit removes the redundant page fault by not marking zero/lazy
pages as cow in m_cow_map.
This commit makes the following minor changes to handle_zero_fault():
* cleans up a call to static_cast(), replacing it with a reference (a
  future commit will also use this reference).
* replaces a call to vmobject() with the new reference mentioned above.
* moves the definition of already_handled to inside the block where
  already_handled is used.
Writes to a MAP_SHARED | MAP_ANONYMOUS mmap region were not visible to
other processes sharing the mmap region. This was happening because the
page fault handler was not remapping the VMObject's m_regions after
allocating a new page.

This commit fixes the problem by calling remap_regions() after assigning
a new page to the VMObject in the page fault handler. This remapping
only occurs for shared Regions.
This commit introduces VMObject::remap_regions_single_page(). This
method remaps a single page in all regions associated with a VMObject.
This is intended to be a more efficient replacement for remap_regions()
in cases where only a single page needs to be remapped.

This commit also updates the cow page fault handling code to use this
new method.
This adds the following tests:
* test that writes to a MAP_SHARED | MAP_ANONYMOUS mmap region are
  visible in processes sharing the region.
* test that writes to a MAP_PRIVATE | MAP_ANONYMOUS mmap region are not
  visible in processes sharing the region.
* test that multi-region mmaps backed by cow pages work correctly.
AnonymousVMObject::try_clone() computed how many shared cow pages to
commit by counting all VMObject pages that were not shared_zero_pages.

This means that lazy_committed_pages were also being included in the
count. This is a problem because the page fault handling code for
lazy_committed_pages does not allocate from
m_shared_committed_cow_pages. So more pages than necessary were being
committed.

This fixes this overcommitting problem by skipping lazy_committed_pages
when counting how many pages to commit.
@brody-qq
Copy link
Contributor Author

Seems like one of the other kernel PRs created some conflicts

Fixed, it was some new code in Region.cpp that didn't impact any functionality of this PR.

@nico nico merged commit 2a164dc into SerenityOS:master Jul 12, 2024
12 checks passed
@github-actions github-actions bot removed the 👀 pr-needs-review PR needs review from a maintainer or community member label Jul 12, 2024
@nico
Copy link
Contributor

nico commented Jul 12, 2024

Thanks!

Since you're in this part of the code: If you're looking for things to do, it'd be nice if posix_spawn() could become a syscall instead of being implemented in userspace. We could save the time needed to copy page metadata then. Most processes are launched using posix_spawn() in serenity, so that might help with performance.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
⚠️ pr-has-conflicts PR has merge conflicts and needs to be rebased
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants