Monday, March 21, 2011

Page Reclamation III (Reverse Mapping)

LWN article (anon_mm) (anon_vma) (comparison)

from PLKA:
the page structure contains a single field to implement reverse mapping: { atomic_t _mapcount; }.
two other data structures are needed: a priority search tree for file address spaces, and a linked list for anonymous address spaces.
the region descriptor (vm_area_struct) has all the info needed to generate the reverse mapping: union shared, anon_vma_node, anon_vma.
this is called object-based reverse mapping because a page is not directly associated with a process; rather, memory regions are associated with the page (and, through them, the processes too).
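
roughly, the fields involved look like this (condensed from the 2.6-era sources PLKA covers; the real structures have many more members, and names shift between versions):

struct page {
    atomic_t _mapcount;            /* count of PTEs mapping this page, minus one */
    struct address_space *mapping; /* file address_space, or anon_vma with the low bit set */
    pgoff_t index;                 /* offset of the page within the mapping */
    /* ... */
};

struct vm_area_struct {
    union {                        /* file pages: node in the address_space's priority search tree */
        struct {
            struct list_head list;
            void *parent;
            struct vm_area_struct *head;
        } vm_set;
        struct raw_prio_tree_node prio_tree_node;
    } shared;
    struct list_head anon_vma_node; /* link in the anon_vma's list of regions */
    struct anon_vma *anon_vma;      /* the anon reverse-mapping object, if any */
    /* ... */
};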

anon mapping:
page->mapping: points to the anon_vma object that the memory region descriptors (vmas) are linked to
page->index: relative position of the page in the memory region
the last bit of page->mapping is 1 for an anon mapping (PAGE_MAPPING_ANON) and 0 for a file mapping.
adding the page to any mapping increments page->_mapcount.
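
a condensed sketch of what page_add_anon_rmap() in mm/rmap.c ends up doing (locking and the exact order of operations differ in the real code):

static void set_anon_rmap(struct page *page, struct vm_area_struct *vma,
                          unsigned long address)
{
    struct anon_vma *anon_vma = vma->anon_vma;

    /* tag the pointer so the kernel can tell anon from file mappings */
    page->mapping = (struct address_space *)
                    ((unsigned long)anon_vma + PAGE_MAPPING_ANON);
    /* relative position of the page inside the region */
    page->index = linear_page_index(vma, address);
    atomic_inc(&page->_mapcount);
}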

mapcount and activity are not synonymous: mapcount is static, whereas activity means the page is being actively used right now. activity means the _PAGE_ACCESSED bit is set in the page table entry for that page in some memory region of a process. so, in the page_referenced() function, we need to visit each memory region that maps the page, get the page table entry, check the _PAGE_ACCESSED bit, and clear it if it is set. interestingly, the value page_referenced() returns is the number of _PAGE_ACCESSED bits that were found set for that particular page across all the processes (memory regions) using it.
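
a sketch of the anon side of this (compare page_referenced_anon()/page_referenced_one() in mm/rmap.c; helper signatures vary a bit between versions, and the real code takes anon_vma->lock around the walk). vma_address() recovers the page's linear address in each region; there is a sketch of it further below:

static int page_referenced_anon_sketch(struct page *page)
{
    struct anon_vma *anon_vma = (struct anon_vma *)
            ((unsigned long)page->mapping - PAGE_MAPPING_ANON);
    struct vm_area_struct *vma;
    int referenced = 0;

    list_for_each_entry(vma, &anon_vma->head, anon_vma_node) {
        unsigned long address = vma_address(page, vma);
        spinlock_t *ptl;
        pte_t *pte;

        if (address == -EFAULT)
            continue;
        pte = page_check_address(page, vma->vm_mm, address, &ptl);
        if (!pte)
            continue;
        /* test-and-clear _PAGE_ACCESSED in one step */
        if (ptep_clear_flush_young(vma, address, pte))
            referenced++;
        pte_unmap_unlock(pte, ptl);
    }
    return referenced;
}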

from ULK3:
the page structure stores a backward link to the memory region descriptors. a memory region descriptor points to its mm_struct, which holds the PGD that can be used to find the PTE for the page. thus, we can get the list of PTEs for a given page structure fairly easily. to find the number of places from which the page is mapped, use the page->_mapcount field. to see whether the mapping is file or anon, look at the last bit of page->mapping. page->index contains the relative position of the page from the beginning of the memory region.
[note: a page in the buddy system should have a mapcount of -1; a non-shared page, 0; a shared page, 1 or more]
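
a sketch of the page-table walk itself, given a region and a linear address (four-level layout, 2.6-era helper names):

static pte_t *find_pte(struct vm_area_struct *vma, unsigned long addr)
{
    struct mm_struct *mm = vma->vm_mm;    /* the mm_struct holds the PGD */
    pgd_t *pgd = pgd_offset(mm, addr);
    pud_t *pud;
    pmd_t *pmd;

    if (pgd_none(*pgd))
        return NULL;
    pud = pud_offset(pgd, addr);
    if (pud_none(*pud))
        return NULL;
    pmd = pmd_offset(pud, addr);
    if (pmd_none(*pmd))
        return NULL;
    return pte_offset_map(pmd, addr);     /* caller must pte_unmap() */
}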

now, page->mapping links to the data structure that connects the memory regions for this page:
page->mapping == NULL: the page is in the swap cache.
page->mapping points to an anon_vma if the last bit is 1 (anon mapping).
page->mapping points to an address_space if the last bit is 0 (file mapping).
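
decoding the tagged pointer is just bit masking (compare PageAnon() in include/linux/mm.h; the helper names here are made up for illustration):

static inline int page_is_anon(struct page *page)
{
    return ((unsigned long)page->mapping & PAGE_MAPPING_ANON) != 0;
}

static inline struct anon_vma *anon_vma_of(struct page *page)
{
    /* only meaningful when page_is_anon(page); clear the tag bit */
    return (struct anon_vma *)
           ((unsigned long)page->mapping & ~PAGE_MAPPING_ANON);
}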

anon memory desc:
when the kernel assigns the first page to an anonymous memory region, it allocates the anon_vma data structure, which has a lock and a list head.
memory regions are kept in that list: mem_reg->anon_vma = anon_vma, and mem_reg->anon_vma_node links the region into the list.
notice there is a lock involved here, so think about it when considering scalability with many shared (anonymous) pages.
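
the object itself is tiny (roughly as in 2.6-era include/linux/rmap.h), and linking a region into it is what anon_vma_prepare()/anon_vma_link() boil down to:

struct anon_vma {
    spinlock_t lock;        /* serializes walks/updates of the list --
                               the scalability concern mentioned above */
    struct list_head head;  /* the vm_area_structs sharing these pages */
};

static void link_region(struct vm_area_struct *vma, struct anon_vma *anon_vma)
{
    vma->anon_vma = anon_vma;
    spin_lock(&anon_vma->lock);
    list_add_tail(&vma->anon_vma_node, &anon_vma->head);
    spin_unlock(&anon_vma->lock);
}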

[note: vma->vm_pgoff = offset of memory region in the mapped file, page->index = offset of the page in the memory region]
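
those two offsets are exactly what vma_address() in mm/rmap.c uses to recover the linear address; a sketch:

static unsigned long vma_address_sketch(struct page *page,
                                        struct vm_area_struct *vma)
{
    pgoff_t pgoff = page->index;
    unsigned long address = vma->vm_start +
            ((pgoff - vma->vm_pgoff) << PAGE_SHIFT);

    if (address < vma->vm_start || address >= vma->vm_end)
        return -EFAULT;    /* page has no computable address in this region */
    return address;
}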

to find the PTE, we need the actual linear address of the page (in that memory region); this is very important. if we can't figure out the linear address of the page for a memory region, we have to search all the PTEs in that memory region for the page; this happens for non-linear memory mappings. for a particular memory region we can reach its PTEs because we have the beginning and ending addresses, so it is easy to query the page table structures to view its current state.

a page might have different linear addresses depending on the memory region it is mapped into. to find the PTE, we need the PGD and the linear address. whenever thinking about a page mapped in memory, think about both its linear address and its physical address.
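
when the linear address can't be computed (the non-linear case above), the fallback is a scan over the region's whole address range; a purely illustrative sketch, reusing find_pte() from earlier:

static unsigned long search_region_for_page(struct vm_area_struct *vma,
                                            struct page *page)
{
    unsigned long addr;

    for (addr = vma->vm_start; addr < vma->vm_end; addr += PAGE_SIZE) {
        pte_t *pte = find_pte(vma, addr);

        if (!pte)
            continue;
        if (pte_present(*pte) && pte_page(*pte) == page) {
            pte_unmap(pte);
            return addr;    /* found a mapping of the page */
        }
        pte_unmap(pte);
    }
    return -EFAULT;
}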

[pitfalls]
mremap() may crash the party by directly modifying page table entries.
if the PTE says the page was accessed, unmapping won't take place, as the page is considered in-use.
locked/reserved memory regions can also nullify the unmapping effort.
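
these show up as early bail-outs in try_to_unmap_one() (mm/rmap.c); a condensed sketch:

static int try_to_unmap_one_sketch(struct page *page,
                                   struct vm_area_struct *vma,
                                   unsigned long address, pte_t *pte)
{
    /* mlock()ed region: the page must stay resident */
    if (vma->vm_flags & VM_LOCKED)
        return SWAP_FAIL;

    /* recently referenced: considered in-use, try again later */
    if (ptep_clear_flush_young(vma, address, pte))
        return SWAP_FAIL;

    /* ... otherwise clear the PTE, remember the swap entry,
       and decrement page->_mapcount ... */
    return SWAP_SUCCESS;
}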

file address space desc:







