iBoot Tasks

iBoot implements its own task-management system, separate from the one in the XNU kernel. It is far less complicated than the kernel's, as there is not much complex task switching going on in iBoot.

The first task to be created is named "main", and it is created once we reach the end of main_generic(), after much of the platform and system init has completed. Tasks are created using the task_create() function and started using task_start(). Once the "main" task completes and eventually returns, task_exit() is called. Let's take a closer look at this:

void _main_generic()
{
    ...
    v18 = _task_create("main", _main_task_, 1, 0x1C00);
    _task_start(v18);

    _task_exit(0);
}

The call to task_create() takes four parameters: the task name, the task entry point, an argument to pass to the entry point and a length for the task's stack. A newly created task structure is then returned, and passed to task_start() to begin executing from the given main_task entry point. See task_create() below:

__int64 __fastcall _task_create (__int64 name, __int64 routine, __int64 arg, __int64 stack_len)
{
    struct task *t;             // Disassembles to `__int64 t;`
    void *stack_ptr;

    if (stack_len < 0x4000)      // TASK_STACK_MIN = 0x4000
        stack_len = 0x4000;

    t = malloc (sizeof(struct task));
    stack_ptr = malloc (stack_len);

    return _task_create_etc (t, name, routine, arg, stack_ptr, stack_len);
}
/*
    struct task
    *task_create (char *name, void* routine, void *arg, uint64_t stack_len)
*/

We can assume the return value is likely a pointer to a task structure: the return value comes straight from _task_create_etc(), and no structure is created in this function itself. Looking at the only if-statement, stack_len appears to have a minimum value of 0x4000. We can refer to this in a cleaner way by defining it as TASK_STACK_MIN.

This function is quite short - in fact, basically everything is done via the call to _task_create_etc(). The purpose of task_create() is simply to allocate memory for the new task structure and for its stack. The bulk of the work is done in _task_create_etc().

Creating Tasks

The function task_create_etc() takes, as arguments, a pointer to a task structure, the name for the task, an entry point, an argument to pass to the task entry point, a stack pointer, and a length for the stack. Like task_create(), it returns a pointer to a task structure.

__int64 __fastcall _task_create_etc (__int64 t, __int64 name, __int64 routine, 
__int64 arg, __int64 stack_ptr, __int64 stack_len);

We can make some assumptions about this function declaration and its arguments. The return type is a pointer to the task struct, name is a string, routine, arg and stack_ptr are all pointers, and stack_len is the size of the stack. So we can rewrite that declaration like so:

struct task
*_task_create_etc (struct task *t, char *name, void *routine, void *arg,
void *stack_ptr, uint64_t stack_len);

Now let's look at the actual function:

struct task
*_task_create_etc (struct task *t, char *name, void *routine, void *arg,
void *stack_ptr, uint64_t stack_len)
{
    uint32_t        i;

    memset (t, 0, sizeof(struct task));

    // This looks a bit crazy, but I go over it next.
    *(uint32_t *)t = 0x7461736B;             // t->start
    *(uint32_t *)(t + 956) = 0x74736B32;     // t->end

    *(uint64_t *)(t + 920) = stack_ptr;      // t->stack_base
    *(uint64_t *)(t + 928) = stack_len;      // t->stack_len

    strlcpy (t + 936, name, 16);   // t->name

    *(uint64_t *)(t + 904) = routine;       // t->routine
    *(uint64_t *)(t + 912) = arg;           // t->arg
    ...
}

You'll quickly notice the unusual way t is being handled here. Take the following structure: a simple C struct with three elements, all of type uint32_t:

struct example {
    uint32_t    a;
    uint32_t    b;
    uint32_t    c;
};

So, for example, if we want to access element b we would normally do so by writing example->b. However, we can also do it by writing *(uint32_t *)(example + 4) - but how does this work? Well, writing *(uint32_t *)(example) reads the value at the base address of our example structure. Picture the structure in memory:

+------------+ 0xfffff0
| example->a |
+------------+ 0xfffff4
| example->b |
+------------+ 0xfffff8
| example->c |
+------------+ 0xfffffc

Writing *(uint32_t *)(example) reads the value at the base of the structure in memory - taking our example this would be 0xfffff0 - and is equivalent to example->a, as the base pointer of a structure is also a pointer to its first element. But what if we want to access an element somewhere else in the struct? Well, we can calculate its "offset" by adding up the size of each element that comes before it in the structure.

Take b. There is one element before it, of type uint32_t (4 bytes). Therefore, to access b we would write (example + 4), element b being offset 4 bytes from the base pointer of the structure. To take another example, if we wanted to access an element, say, 24 bytes from the base of the structure, we'd simply write (example + 24). Simple!
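One thing to be aware of: the decompiler treats example as a raw byte address, whereas in standard C, example + 4 on a struct example * would advance by four whole structs, so we cast through char * for byte arithmetic. A minimal standalone example of both forms of access (this is just an illustration, not iBoot code):

#include <stdio.h>
#include <stddef.h>
#include <stdint.h>

struct example {
    uint32_t    a;
    uint32_t    b;
    uint32_t    c;
};

int main (void)
{
    struct example ex = { 1, 2, 3 };
    struct example *p = &ex;

    /* Both of these read the same four bytes. */
    uint32_t normal = p->b;
    uint32_t manual = *(uint32_t *)((char *)p + 4);

    printf ("b = %u, via offset = %u, offsetof(b) = %zu\n",
            normal, manual, offsetof (struct example, b));
    return 0;
}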

Looking back to the _task_create_etc() code: now that we understand what those strange pointers are, we can start to look at what they're actually assigning - and why.

Firstly, the memset() clears the memory allocated for the task structure by setting every byte to 0. We then begin setting the values of the task structure by assigning two hexadecimal values to the start of the structure (t->start) and the end (t->end). Looking more closely at the two hex values, they appear to be strings - specifically "task" and "tsk2". Let's begin to draw a view of the structure in memory so we can rebuild the task structure:

    +------------+ 0x0          // task struct base
    |   "task"   |              // int start;
    +------------+ 0x8
    |    ...     |
    +------------+ 0x3bc        // (t + 956)
    |   "tsk2"   |              // int end;
    +------------+ 0x3c4

So we have the first element start and the last element end set to "task" and "tsk2" respectively. The purpose of this appears to be so the start and end of a task structure can be easily found in memory.
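To illustrate that idea (this is purely my own sketch, not something iBoot is known to do), a scan for task structures in a memory dump could simply look for those two markers at the expected distance apart:

#include <stdint.h>
#include <string.h>

#define TASK_MAGIC_START    0x7461736B      /* "task" */
#define TASK_MAGIC_END      0x74736B32      /* "tsk2" */
#define TASK_END_OFFSET     956             /* offset of `end`, from the disassembly */

/* Hypothetical helper: does this address look like the start of a task struct? */
static int looks_like_task (const uint8_t *p)
{
    uint32_t start, end;

    memcpy (&start, p, sizeof (start));
    memcpy (&end, p + TASK_END_OFFSET, sizeof (end));

    return start == TASK_MAGIC_START && end == TASK_MAGIC_END;
}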

Continuing on, we set the stack_ptr and stack_len values at offsets 920 and 928. The value of stack_ptr must be 8 bytes long, because no other values are assigned at an offset between 920 and 928, and because the platform is AArch64 we can assume addresses are uint64_t's. stack_len is passed as a uint64_t, so we already know its type. Let's update our structure and memory layout:

    +------------+ 0x0          // task struct base
    |   "task"   |              // int start;
    +------------+ 0x8
    |    ...     |
    +------------+ 0x398    
    | stack_ptr  |              // uint64_t *stack_ptr;
    +------------+ 0x3a0
    | stack_len  |              // uint64_t stack_len
    +------------+ 0x3a8
    |    ...     |
    +------------+ 0x3bc        // (t + 956)
    |   "tsk2"   |              // int end;
    +------------+ 0x3c4

After the stack values are set we have a call to strlcpy. This function, if we run man strlcpy, performs "size-bounded string copying and concatenation", with the first argument being the destination of the copy, the second being the source, and the third being the size of the destination buffer. In our case, we are performing strlcpy (t + 936, name, 16), so copying at most 16 bytes (including the NUL terminator) from the name variable to t + 936. It's apparent that name has a maximum size of 16 bytes, and the offset of 936 is farther from the base of the structure than stack_len, so it must fall after that.

Before we re-draw the memory layout, let's also look at the next two assignments. We assign routine at offset 904 and arg at 912. These two are 8 bytes apart, so routine is probably a pointer like stack_ptr, and therefore a uint64_t. The size and type of arg are not clear, so we won't assume a data type.

Now let's redraw the layout. We have already established that name is farther from the base than any other element so far, so it will be placed after stack_len. arg, however, sits only 8 bytes before stack_ptr, so it will be placed just before it:

    +------------+ 0x0          // task struct base
    |   "task"   |              // int start;
    +------------+ 0x8
    |    ...     |
    +------------+ 0x388
    |  routine   |              // uint64_t *routine;
    +------------+ 0x390
    |    arg     |              // void *arg;
    +------------+ 0x398    
    | stack_ptr  |              // uint64_t *stack_ptr;
    +------------+ 0x3a0
    | stack_len  |              // uint64_t stack_len;
    +------------+ 0x3a8
    |   name     |              // char name[16];
    +------------+ 0x3b8
    |    ...     |
    +------------+ 0x3bc        // (t + 956)
    |   "tsk2"   |              // int end;
    +------------+ 0x3c4
    ...
    _enter_critical_section();
    *(uint32_t *)(t + 952) = ++max_task_id;         // t->task_id
    _exit_critical_section();
    ...

Here we have another property at offset 952, and two interesting function calls surrounding its assignment. This value is called task_id, but before we discuss what those function calls do, first a little background on how task management works in theory.

When a new task is created it is added to a global task queue. The global task queue contains all the tasks that have been created, regardless of their current state. However, a task cannot execute by itself; it must be assigned CPU time by something called a scheduler. In an operating system, the scheduler decides which tasks should be executed when, and how much CPU time they get before we switch to another task.

Generally the scheduler can be triggered in two ways: a task can voluntarily invoke the scheduler, or it can be invoked by an interrupt generated by the system timer. The ARM Developer site contains a lot of useful information on how the ARM Generic Timer works. Basically, ARM processors have a timer which can fire an interrupt once it expires - the amount of time that elapses before that interrupt fires can be controlled, along with whether the timer should fire interrupts at all or just silently reset. The timer can be configured to the interval at which we want the scheduler to run, but the timer itself cannot call the scheduler. For that we need what is called an Interrupt Handler - a piece of code that runs whenever an interrupt of a certain type is fired. In this case, we can call the scheduler from the timer interrupt handler.
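As a very rough sketch of that flow (none of these names are confirmed iBoot symbols - they are just here to illustrate the idea):

#include <stdint.h>

extern void timer_set_deadline (uint64_t ticks);    /* hypothetical: re-arm the generic timer */
extern void task_yield (void);                      /* hypothetical: invoke the scheduler */

#define SCHEDULER_QUANTUM   10000                   /* arbitrary example interval */

static void timer_interrupt_handler (void)
{
    /* Re-arm the timer so the next quantum generates another interrupt... */
    timer_set_deadline (SCHEDULER_QUANTUM);

    /* ...then hand control to the scheduler, which may switch to another task. */
    task_yield ();
}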

Now that we have established that the system timer will generate an interrupt at set intervals, and that the interrupt handler will call the task scheduler, we are presented with an issue: critical operations. Occasionally a task will be executing something that cannot be interrupted - say a read/write from a file. We need a way to disable interrupts and rescheduling while we execute particular bits of code.

This is where _enter_critical_section() and _exit_critical_section() come in - they disable interrupts while we do something critical that cannot be interrupted by the scheduler. The two functions work by altering the irq_disable_count property (discussed later) of the current task's task struct - the current task being set whenever the scheduler switches tasks.

To enter a "critical section" the current task's irq_disable_count must become 1; while that is the case, interrupts are disabled. To exit a "critical section" the opposite applies - irq_disable_count must return to 0 before interrupts are enabled again.
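A minimal sketch of how that pair might work, based purely on the behaviour described above - the arch_* helpers are assumptions, and the count is shown here as a plain counter rather than the field inside the current task's struct:

extern void arch_disable_interrupts (void);     /* hypothetical helper */
extern void arch_enable_interrupts (void);      /* hypothetical helper */

static int irq_disable_count;                   /* really lives in the current task struct */

void _enter_critical_section (void)
{
    if (++irq_disable_count == 1)
        arch_disable_interrupts ();             /* first nesting level masks IRQs */
}

void _exit_critical_section (void)
{
    if (--irq_disable_count == 0)
        arch_enable_interrupts ();              /* last exit unmasks IRQs again */
}

Back in _task_create_etc(), the next pair of assignments looks like this: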

    ...
    *(uint64_t *)(t + 24) = 0;
    *(uint64_t *)(t + 32) = 0;
    ...

Looking at the above code we are setting two values to 0; however, these are not two separate elements. iBoot has an implementation of doubly linked lists, defined as a list_node with prev and next properties, themselves pointers to list_nodes.

struct list_node {
    struct list_node *prev;
    struct list_node *next;
};

The list_node struct is 16 bytes, with prev and next each being 8 bytes. So accessing (t + 24) is essentially writing the list node's prev, and (t + 32) its next. Therefore we are setting the previous and next pointers of this list node to 0.

Now, the way this list implementation works is stupidly complicated but very memory efficient, and it took a lesson from @zistooshort for me to even partially understand it. Apple's implementation does not include a data property in the list_node, whereas most other implementations would. Z will now explain:

@zistooshort:

    A linked list without data you say?

        struct node {
            struct node *next;      /* pointer to the next node */
            struct node *prev;      /* pointer to the previous node */
            /* but nothing for the nodes data */
        };

    The data is actually the struct containing the list node. Let's say we have a struct
    like so:

        struct data {
            int some;
            int fascinating;
            int info;
            struct node node;
        };

    Now all the items in a linked list should be of the same type, and we know what type a 
    given list contains.

    Given a pointer to a list node

        struct node *interesting_stuff;

    We know `interesting_stuff` is an item in a list of `struct data`. Since the `struct node` 
    is part of `struct data` the pointer is actually equal to `&the_data->node`, if only we 
    knew where `the_data` is...

    Except we do!

    Just as we can access `the_data->node` with `the_data + 12` we can go backwards: 
    `interesting_stuff - 12`.  Et voilà, you know where `the_data` is.

    Using offsetof, which calculates the offset of a member of a given struct, we can create
    a `containerof` (variants of which are seen in the Linux kernel, Wayland and presumably 
    iBoot) to calculate this for us.

        #define containerof(ptr, type, member) \
            ((type *) ((char *) (ptr) - offsetof (type, member)))

    Allowing us to write:

        struct node *interesting_stuff = other_stuff->next;
        struct data *the_data = containerof(interesting_stuff, struct data, node);

        the_data->info = 42;

    So there you are, a linked list node without a data member. Why do this? It saves you a bit 
    of memory and potentially avoids malloc()'s.

Now, this first list_node is called queue_node. I'm not entirely sure of its use, but I at least know what it's called. And this isn't the only list_node - in fact there are three. The next is called return_waiters, and is used to store the list of tasks that are waiting on this task to complete before they can continue. The initial values of its prev and next properties are set to point at itself. The disassembled code looks like so, but can be more easily understood written as list->prev = list->next = list:

    ...
    //  So, t + 880 is the base of `return_waiters`, and also the `prev` property
    //  as it comes first. t + 888 points to the `next` property, as each `list_node`
    //  element is 8 bytes.
    //
    *(uint64_t *)(t + 888) = t + 880;
    *(uint64_t *)(t + 880) = t + 880;

    //  Setting two values: the initial task state, and irq_disable_count.
    //
    *(uint32_t *)(t + 40) = 0;
    *(uint32_t *)(t + 44) = 1;
    ...
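The same initialisation can be written as a small helper - an empty list is one whose node points at itself in both directions, which is exactly what the two stores above achieve for return_waiters. The name list_initialize is my guess, not a confirmed symbol:

static void list_initialize (struct list_node *list)
{
    list->prev = list;
    list->next = list;
}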

I hope that makes sense. Let's update the memory structure:

    +------------+ 0x0          // task struct base
    |   "task"   |              // int start;
    +------------+ 0x8
    |    ...     |
    +------------+ 0x18
    | queue_node |              // struct list_node queue_node;
    |     ->prev |  +0          // 0x18 + 0
    |     ->next |  +8          // 0x18 + 8
    |            |
    +------------+ 0x28
    |   state    |              // task_state state;
    +------------+------+ 0x2c
    | irq_disable_count |       // int irq_disable_count;
    +------------+------+ 0x34
    |    ...     |
    |            |
    +------------+---+ 0x370
    | return_waiters |              // struct list_node return_waiters;
    |     ->prev     |  +0          // 0x370 + 0
    |     ->next     |  +8          // 0x370 + 8
    |                |
    +------------+---+ 0x380
    |    ...     |
    +------------+ 0x388
    |  routine   |              // uint64_t *routine;
    +------------+ 0x390
    |    arg     |              // void *arg;
    +------------+ 0x398    
    | stack_ptr  |              // uint64_t *stack_ptr;
    +------------+ 0x3a0
    | stack_len  |              // uint64_t stack_len;
    +------------+ 0x3a8
    |   name     |              // char name[16];
    +------------+ 0x3b8
    |  task_id   |              // int task_id;
    +------------+ 0x3bc        
    |   "tsk2"   |              // int end;
    +------------+ 0x3c4

To recap what we have added: first is the queue_node list, then the state. A task must always have a current state - it cannot be NULL - therefore state 0, or TASK_INITIAL, is assigned to newly created tasks. As the task executes the state can change; for example, if the task needs to go to sleep it can call _task_sleep() and pass an interval to sleep for. The state would be set to TASK_SLEEPING and the scheduler will skip over that task until the interval has passed and the state is set back to TASK_RUNNING.
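A plausible reconstruction of that state enum looks like the following - the names match the states mentioned throughout this post, but the exact ordering and values (other than TASK_INITIAL being 0) are assumptions:

enum task_state {
    TASK_INITIAL = 0,
    TASK_READY,
    TASK_RUNNING,
    TASK_BLOCKED,
    TASK_SLEEPING,
    TASK_FINISHED,
};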

Next is irq_disable_count, which we mentioned earlier. This value controls whether a task can be interrupted from what it's doing and rescheduled. Then there is the second list, return_waiters, which is a list of tasks waiting on our task to complete. Finally there is task_id, a unique identifier for this task.

iBoot does not leave a task's stack empty; instead it fills the stack with a "magic" value, which turns out to be the string "kats".

while ( 1 )
  {
    v12 = (unsigned int)v12;
    if ( (unsigned int)v12 >= stack_len >> 2 )
      break;
    *(_DWORD *)(*(_QWORD *)(t_struct + 920) + 4 * v12) = *(_DWORD *)"kats";
    LODWORD(v12) = v12 + 1;
  }

This looks horrible. How can we clean this up a little? Well, to start with, it's not actually meant to be a while loop - it's a for loop. Say our stack_len value is 0x1C00 and each element on the stack is 4 bytes (which it is). Then stack_len / 4 equals 1792, the same result as stack_len >> 2. Let's start to rewrite this:

    if ( (unsigned int)v12 >= stack_len / 4)

This is our exit condition. We're saying that if the value of v12 is greater than or equal to stack_len divided by 4, we should exit the loop. Let's focus on that v12 value for a moment: this is the counter, what we would usually name i. It's incremented by one at the end of each pass through the loop. So we have the counter, the condition, and the operation to be performed in the loop. Now we can rewrite the above like this:

for (int v12 = 0; v12 < (stack_len / 4); v12++)
    // Let's swap out _DWORD and _QWORD too!
    //*(uint32_t *)(*(uint64_t *)(t + 920) + 4 * v12) = *(uint32_t *)"kats";
    
    // Now rewrite that to remove the nasty-looking pointer arithmetic
    ((uint32_t *)t->stack_base)[v12] = *(uint32_t *)"kats";

So what is this even doing? We know that each task is allocated its own "stack", with t->stack_base pointing to the base. Interestingly, when iBoot is creating a new task the entire stack is initially filled with a "magic" value. The actual aim isn't too clear, but it could be to prevent elements of the stack from being NULL, or it could be some kind of debugging feature.

The way this is done may look complicated but it's rather simple. The code ((uint32_t *)t->stack_base)[v12] treats t->stack_base as an array of 4-byte values and writes the magic value "kats" into index v12 - in other words, to byte offset v12 * 4 from the base. This can be better understood as mem[stack_base + (v12 * 4)] = "kats";. The actual loop condition is not particularly interesting; we are simply walking through the stack 4 bytes at a time.
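For completeness, the whole fill could be written as a small standalone helper; the function name and the way the magic value is loaded are my own choices, not iBoot's:

#include <stdint.h>
#include <string.h>

/* Fill a freshly allocated task stack with the 4-byte "kats" magic,
 * one 32-bit word at a time. */
static void fill_task_stack (void *stack_base, uint64_t stack_len)
{
    uint32_t magic;
    memcpy (&magic, "kats", sizeof (magic));

    uint32_t *stack = (uint32_t *) stack_base;
    for (uint64_t i = 0; i < stack_len / 4; i++)
        stack[i] = magic;
}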

Lastly, we call arch_task_create() and increment the global task count. Writing ++MEMORY[0x18011CD44]; is essentially writing global_variable++; - this is just the way IDA has decompiled iBoot.

    ...
    _arch_task_create((uint64_t *)t);
    ++MEMORY[0x18011CD44];

    return t;
}

However, before we cover _arch_task_create(), we need to add one more property to the task structure we've been putting together: the arch property, an arch_task structure which stores the general-purpose registers, the stack pointer, the link register and the frame pointer. The aim is that the CPU state can be saved and restored when tasks are switched. Let's first take a look at the arch_task structure.

This structure is defined with six properties, split into two "groups": first the "Core Registers", and second the "Floating-Point Registers". This arch_task structure is used to restore the state of the CPU when either jumping to a new task, or resuming a task that has previously been running.

For the Core registers, we have an array of 29 elements, uint64_t regs[29];, to represent the general-purpose registers x0 to x28. Next is uint64_t fp; for the Frame Pointer (x29), uint64_t lr; for the Link Register (x30), and uint64_t sp; for the Stack Pointer. The Floating-Point Registers simply consist of uint32_t fpsr;, the Floating-Point Status Register, and uint32_t fpcr;, the Floating-Point Control Register.

struct arch_task {
    /* Core Registers */
    uint64_t        regs[29];       // x0 to x28
    uint64_t        fp;
    uint64_t        lr;
    uint64_t        sp;

    /* Floating Point Registers */
    uint32_t        fpsr;
    uint32_t        fpcr;
};

Now that we have an idea of what the arch structure consists of, we can look at what the arch_task_create() function is actually doing.

__int64 __fastcall _arch_task_create (struct task *t)
{
    for (int i = 0; i < 29; i++)
        t->arch.regs[i] = 0;

    t->arch.lr = (uint64_t) &arch_task_trampoline;
    t->arch.sp = (uint64_t) t->stack_base + t->stack_len; 
    t->arch.fp = 0;
    
    return 0;
}

Firstly, we're quite clearly setting initial values of 0 for the registers x0 to x28. Next is setting the value of the Link Register. This is the address the task will "return" to when it first runs, so we're effectively setting the initial entry point for the task before it actually jumps to the value set in task->routine - this being the arch_task_trampoline() function. The Stack Pointer is set to the stack base plus the size of the stack, so it points to the top of the stack. The Frame Pointer is just set to zero.
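We haven't disassembled arch_task_trampoline() here, but given that the initial lr points at it and that routine and arg exist in the task struct, it presumably does something along these lines. This is a guess based on the description above, not decompiled code, and current_task is an assumed global:

static void _arch_task_trampoline (void)
{
    struct task *t = current_task;      /* assumed global, set by the scheduler */
    int ret = t->routine (t->arg);      /* jump into the task's entry point */
    _task_exit (ret);                   /* never returns */
}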

The new task has now been created and returned to the original caller. However, creating a task does not begin executing it; instead, it must be started. For this, we have the _task_start() function, which takes a task structure.

Now, iBoot has a global task queue; when a task is started, all that is needed is to add the new task to that global queue so the scheduler can pick it up. A task can only be "started" if it's in its TASK_INITIAL state - if it's in any other state, we just return and continue. Otherwise, the task state is updated to TASK_READY and we enter a critical section to call insert_run_q_tail(t) on the task. This is what updates the global run queue and the queue_node property of the task, as sketched below.
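Based on that description, _task_start() presumably looks something like this - a reconstruction from the behaviour described above rather than a decompilation, with the return values being assumptions:

int _task_start (struct task *t)
{
    if (t->state != TASK_INITIAL)
        return 0;

    /* Mark the task as ready and put it on the global run queue. */
    t->state = TASK_READY;

    _enter_critical_section ();
    insert_run_q_tail (t);
    _exit_critical_section ();

    return 1;
}

Here is our final task structure, reconstructed from what we have covered.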

    +------------+ 0x0          // task struct base
    |   "task"   |              // int start;
    +------------+ 0x8
    |    ...     |
    +------------+ 0x18
    | queue_node |              // struct list_node queue_node;
    |     ->prev |  +0          // 0x18 + 0
    |     ->next |  +8          // 0x18 + 8
    |            |
    +------------+ 0x28
    |   state    |              // task_state state;
    +------------+------+ 0x2c
    | irq_disable_count |       // int irq_disable_count;
    +------------+------+ 0x34
    |   arch     |              // struct arch_task arch;
    +------------+ 0x344
    |    ...     |
    |            |
    +------------+---+ 0x370
    | return_waiters |              // struct list_node return_waiters;
    |     ->prev     |  +0          // 0x370 + 0
    |     ->next     |  +8          // 0x370 + 8
    |                |
    +-------------+---+ 0x380
    | return_code |                 // int return_code;
    +-------------+ 0x388
    |  routine   |              // uint64_t *routine;
    +------------+ 0x390
    |    arg     |              // void *arg;
    +------------+ 0x398    
    | stack_ptr  |              // uint64_t *stack_ptr;
    +------------+ 0x3a0
    | stack_len  |              // uint64_t stack_len;
    +------------+ 0x3a8
    |   name     |              // char name[16];
    +------------+ 0x3b8
    |  task_id   |              // int task_id;
    +------------+ 0x3bc        
    |   "tsk2"   |              // int end;
    +------------+ 0x3c4

You may, however, notice that we still have two gaps: first a gap of 0x10 bytes between start and queue_node, and then a gap of 0x2C bytes between arch and return_waiters. With those filled in we have a full picture of how the iBoot task structure works - and because the difference between the iBoot and XNU task structures is rather drastic, I'll show the complete structure now, along with a brief summary of each property.

struct task {
    int start;

    struct list_node        task_list_node;        // Fills that gap of 0x10 bytes.
    struct list_node        queue_node;

    enum task_state         state;

    int                     irq_disable_count;

    struct arch_task        arch;

    struct callout          sleep_callout;         // Fills the gap of 0x2C bytes.

    struct task_wait_queue  return_waiters;
    int                     return_code;

    task_routine            routine;
    void                   *arg;

    void                   *stack_base;
    size_t                  stack_len;

    char                    name[16];

    int                     task_id;

    int                     end;
};

Starting with start: this denotes the start of the task structure in memory, and its value is always set to the hexadecimal encoding of "task". As I've been demonstrating with the memory layouts above, these bytes make it easier to identify a task structure when examining memory, or when some code needs to walk all of the tasks.

task_list_node is a list of all created tasks, regardless of their state. The exact use of queue_node is less clear to me, although insert_run_q_tail() updates it, which suggests it links the task into the run queue. state tracks the current state of the task, and can be set to INITIAL, READY, RUNNING, BLOCKED, SLEEPING or FINISHED as required. All of these values are defined by the task_state enum.

arch is the saved CPU state, including the registers and the stack pointer. This is necessary so the CPU state can be saved, and later restored to whatever the task was doing, when an interrupt is received to switch tasks. sleep_callout is the callout used when the task goes into a SLEEPING state. return_waiters is a list of other tasks that are waiting on this task to complete.

The routine property points to the routine/function to call when the task starts - the task_routine type is simply a pointer to the function provided to _task_create(), as sketched below. arg is an argument passed to routine when the task is started.
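In C terms, that presumably makes task_routine a function-pointer type along these lines - the exact return type and signature are assumptions:

typedef int (*task_routine) (void *arg);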

stack_base is a pointer to the base of the task's allocated stack, with stack_len indicating the size of this stack.

name is a 16-byte identifiable name for the task, for example "main", with task_id being the unique numerical ID for the task. Finally, end just marks the end of the task structure when looking at it in memory.

Summary

Within iBoot there are dozens of tasks that are created and running, from USB handlers and UART to console, poweroff and, most importantly, main.

In this post we've covered creating and starting tasks, some theory about how the Task system works in iBoot, that explanation from @zistooshort about the way iBoot implements Linked-Lists, and the task structure as a whole.

In the next post I aim to cover iBoot Profiles.

Thank you for reading, as always you can reach me by email at me@h3adsh0tzz.com, and on Twitter @h3adsh0tzz.