
Runqueue Balancing


Peer-to-Peer Balancing

The default (and currently the only) load-balancing technique used by the kernel is peer-to-peer balancing, as named by Shukant Pal. It is used by the RoundRobin scheduler and will, in the near future, be used by all upcoming schedulers.

At each timer interval, the scheduler calls RunqueueBalancer::balanceWork() with the relevant scheduling class (ROUND_ROBIN, CFS, or any other).
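
As a rough illustration, the call site might look like the sketch below. The onSchedulerTick() name and the way balanceWork() is invoked here are assumptions for illustration only, not the kernel's actual timer path.

```cpp
// Hypothetical periodic entry point (assumed name); each tick, every
// scheduling class gets a chance to rebalance its runqueues.
void onSchedulerTick()
{
    RunqueueBalancer::balanceWork(ROUND_ROBIN);
    // ...and likewise for CFS or any other class, once those exist
}
```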

```cpp
void RunqueueBalancer::balanceWork(ScheduleClass cls)
{
    Domain *rqholder = GetProcessorById(PROCESSOR_ID)->domlink;

    while(rqholder->parent != NULL)
    {
        if(getSystemTime() > rqholder->taskInfo[cls].balanceDelta)
            balanceWork(cls, rqholder);
        else
            Dbg("called");// for debugging purposes only
        rqholder = rqholder->parent;
    }
}
```
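
For orientation, the domain fields that these snippets touch can be pictured roughly as in the sketch below. This is an assumption reconstructed only from the code on this page; the Spinlock type, SCHED_CLASS_COUNT, and the exact layout are placeholders, and the kernel's real Domain definition is richer. The per-processor domlink pointer used above presumably points at the calling CPU's lowest-level domain of this kind.

```cpp
// Assumed, simplified shape of a scheduling domain, inferred from the fields
// used in the snippets on this page; not the kernel's actual declaration.
struct DomainTaskInfo
{
    unsigned long balanceDelta;  // system time after which the next balance is due
};

struct Domain
{
    Domain *parent;              // enclosing domain; NULL for the top-most (SYSTEM) domain
    unsigned long level;         // depth of this domain in the topology
    Spinlock queueLock;          // taken while updating per-class bookkeeping
    DomainTaskInfo taskInfo[SCHED_CLASS_COUNT];
};
```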

The single-argument balanceWork() walks up the system topology, stopping once it has handled a direct child of SYSTEM (the top-most domain). For every domain that is due for balancing, it calls balanceWork(cls, rqholder), which semi-balances that domain:

```cpp
void RunqueueBalancer::balanceWork(ScheduleClass cls, Domain *client)
{
    Domain *busiest = DomainBinding::findBusiestGroup(cls, client);

    if(!busiest)
    {
        return;// TODO: Implement shutting off the CPUs
    }
    else if(busiest != client)
    {
        // send a renounce request; an accept request will arrive later
        Processor *srcCPU = DomainBinding::getBusiest(cls, busiest);
        Processor *dstCPU = DomainBinding::getIdlest(cls, client);

        Renounce *req = new(tRunqueueBalancer_Renounce) Renounce(cls, *busiest, *client, *srcCPU, *dstCPU);
        CPUDriver::writeRequest(*req, srcCPU);
    }

    SpinLock(&client->queueLock);
    client->taskInfo[cls].balanceDelta += client->level * client->level * 64;
    SpinUnlock(&client->queueLock);
}
```

This function does not balance the two domains by transferring tasks to or from this CPU directly. Instead, it arranges for tasks to be transferred from the busiest processor of the source (busiest) domain to the idlest processor of the destination (client) domain. It also delays the next balance pass for the client domain by level × level × 64 time units, so a level-1 domain is revisited after 64 units and a level-2 domain after 256.

It works by sending a RunqueueBalancer::Renounce request to the busy processor. That processor renounces some of its tasks and sends back a reply of type RunqueueBalancer::Accept carrying the list of transferred tasks. With the help of IPIRequest, the kernel gets lockless access to processor runqueues; this is safe only because each runqueue is accessed exclusively by its owner CPU.
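
The two message types can be pictured roughly as in the sketch below. This layout is pieced together from the handler that follows and is an assumption, not the kernel's actual declarations; in particular, the IPIRequest base, KTask, and CircularList are taken on faith here.

```cpp
// Assumed, simplified layout of the balancer IPI messages, inferred from the
// fields used in the handler below; the real kernel declarations may differ.
struct Renounce : public IPIRequest
{
    ScheduleClass taskType;   // scheduling class to pull tasks from
    Domain &donor, &taker;    // busiest domain and the domain asking for work
    Processor &src, &dst;     // busiest CPU in donor, idlest CPU in taker

    Renounce(ScheduleClass cls, Domain &donor, Domain &taker, Processor &src, Processor &dst)
        : taskType(cls), donor(donor), taker(taker), src(src), dst(dst) { }
};

struct Accept : public IPIRequest
{
    ScheduleClass type;       // scheduling class of the transferred tasks
    Domain &donor, &taker;    // copied over from the Renounce request
    CircularList taskList;    // tasks the donor CPU has renounced

    Accept(ScheduleClass cls, Domain &donor, Domain &taker)
        : type(cls), donor(donor), taker(taker) { }
};
```

The handler below consumes these messages.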

```cpp
switch(req->type)
{
case AcceptTasks:
{
    RunqueueBalancer::Accept *acc = (RunqueueBalancer::Accept*) req;
    tcpu->lschedTable[acc->type]->recieve((KTask*) acc->taskList.lMain,
                    (KTask*) acc->taskList.lMain->last, acc->taskList.count);
    break;
}
case RenounceTasks:
{
    DbgLine("EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE");// debug marker (leftover)
    RunqueueBalancer::Renounce *ren = (RunqueueBalancer::Renounce*) req;
    ScheduleClass type = ren->taskType;

    unsigned long loadDelta = ren->src.lschedTable[type]->load - ren->dst.lschedTable[type]->load;
    unsigned long deltaFrac = ren->donor.level + 1;

    loadDelta *= deltaFrac - 1;
    loadDelta /= deltaFrac;

    // for cpus in the same domain -> transfer 1/2 of the load-delta (equalizing the load)
    // for cpus in different domains -> transfer 2/3, 3/4, 4/5, etc.

    RunqueueBalancer::Accept *reply = new(tRunqueueBalancer_Accept)
                    RunqueueBalancer::Accept(type, ren->donor, ren->taker);
    ren->src.lschedTable[type]->send(&ren->dst, reply->taskList, loadDelta);
    CPUDriver::writeRequest(*reply, &ren->src);

    kobj_free((kobj*) ren, tRunqueueBalancer_Renounce);

    DbgLine("renounce not imple yet");// stale debug message; the path above is implemented
    break;
}
default:
    return;
}
```

This code is taken from the Executable_ProcessorBinding_IPIRequest_Handler function, which handles requests sent from another CPU to this CPU. You can see how the renounce and accept cases are handled.

In the renounce case, the processor renounces a larger share of its tasks when the participating domains sit higher in the topology.
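
Following the formula in the handler, a donor domain at level 1 gives up 1/2 of the load difference, a level-2 domain gives up 2/3, a level-3 domain 3/4, and so on. The stand-alone sketch below reproduces that arithmetic with invented numbers, purely for illustration:

```cpp
#include <cstdio>

// Reproduces the load-delta fraction from the handler above, outside the
// kernel; the loads are made up purely for illustration.
int main()
{
    unsigned long srcLoad = 120, dstLoad = 0;          // invented per-CPU loads
    for(unsigned long donorLevel = 1; donorLevel <= 3; ++donorLevel)
    {
        unsigned long loadDelta = srcLoad - dstLoad;
        unsigned long deltaFrac = donorLevel + 1;
        loadDelta *= deltaFrac - 1;                    // keep (deltaFrac - 1) / deltaFrac
        loadDelta /= deltaFrac;                        // of the load difference
        // prints: level 1 -> 60, level 2 -> 80, level 3 -> 90
        std::printf("level %lu: transfer %lu\n", donorLevel, loadDelta);
    }
    return 0;
}
```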
