Skynet Development Guide
Updated: 2024/12/12

Skynet is an open-source, lightweight game server framework that implements the actor model and its supporting scaffolding in C. It provides a mature microservice scheduling framework and uses Lua virtual machines (VMs) as the concrete implementation of actors. For details, see the official Skynet repository on GitHub.

openUBMC uses only some features of Skynet. This document covers only those features, as a guide for openUBMC development.

Components and Services

In Skynet, a service is the smallest independent running unit. Each service has an independent Lua virtual machine and runs in only one thread at a time. Services interact with each other through asynchronous messages.

In openUBMC, you can create one or more Skynet services based on the functions that a component requires. Most components need only one service. Because a service runs on only one thread at a time, the logic of a single-service component never runs in parallel with itself, so you do not need to guard against parallel data access when writing such a component.

Creating a Service

In Skynet, you can create a service using skynet.uniqueservice and register a name for the current service using skynet.register. In an openUBMC component, the src/service/main.lua file controls the creation of component services.

lua
local skynet = require 'skynet'
require 'skynet.manager' -- provides skynet.register in upstream Skynet
local logging = require 'mc.logging'
local my_app_app = require 'my_app_app'

skynet.start(function()
    -- Start the shared sd_bus service, or reuse it if it is already running.
    skynet.uniqueservice("sd_bus")
    -- Register the current service under the name my_app.
    skynet.register("my_app")
    -- Create the component object; pcall turns a startup error into a log entry.
    local ok, err = pcall(my_app_app.new)
    if not ok then
        logging:error('my_app start failed, err: %s', err)
    end
end)

In the preceding example, the system creates an sd_bus service and registers the current service under the name my_app. It also creates a my_app_app object, which is the component object based on the micro-component framework.

The input parameter of skynet.uniqueservice is the unique name of the service. If a service with the same name already exists in the same process, Skynet does not start it again. Therefore, although every component requests an sd_bus service, only one instance exists in the process.

Coroutines

A coroutine is similar to a thread (in the sense of multithreading): it is a line of execution, with its own stack, its own local variables, and its own instruction pointer; it shares global variables and mostly anything else with other coroutines. The main difference between threads and coroutines is that a multithreaded program runs several threads in parallel, while coroutines are collaborative: at any given time, a program with coroutines is running only one of its coroutines, and this running coroutine suspends its execution only when it explicitly requests to be suspended.

Excerpt from Programming in Lua, Fourth Edition
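
The collaborative behavior described in the excerpt can be observed with plain Lua coroutines, without Skynet. The following is a minimal sketch that uses only the standard coroutine library; the log table is introduced here purely for illustration.

```lua
-- Plain Lua, no Skynet required.
local log = {}

local co = coroutine.create(function()
    log[#log + 1] = "step 1"
    coroutine.yield()            -- explicitly hand control back to the caller
    log[#log + 1] = "step 2"
end)

coroutine.resume(co)             -- runs the coroutine until its first yield
log[#log + 1] = "main"           -- the caller runs while the coroutine is suspended
coroutine.resume(co)             -- continues the coroutine after the yield

print(table.concat(log, ", "))   -- step 1, main, step 2
print(coroutine.status(co))      -- dead
```

Only one line of execution is active at any moment, and control changes hands only at the explicit yield and resume calls.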

openUBMC introduces two concepts from microservice design: coroutines and concurrency.

In the micro-component architecture, the scope of each service is relatively small, and the number of services grows as functionality expands. If every microservice had its own process or thread, system resources would become a bottleneck. Most BMC services can be decomposed into small independent units that work only when triggered by external factors such as devices or network management systems. Therefore, openUBMC uses coroutines to orchestrate service logic and relies on concurrency to handle a number of services that will keep growing.

Lua provides excellent support for coroutines, and creating and switching coroutines is cheap in most scenarios. However, a poorly designed set of coroutines that switches too frequently can still waste significant time on context switching.

Note:

Coroutine switching is supported only in Lua code.

Code snippet:

lua
local skynet = require 'skynet'

print("before skynet fork")
skynet.fork(function()
    print("inside coroutine")
end)
print("after skynet fork")

Results:

bash
> before skynet fork
> after skynet fork
> inside coroutine

Note:

skynet.fork can be misleading. Here, fork does not create a thread, but a coroutine.

The interface for creating coroutines in Skynet is skynet.fork, and its input parameter is a function. A forked coroutine does not start immediately: it is queued and runs only after the current function finishes execution or suspends itself. This is why, in the preceding example, "inside coroutine" is printed after "after skynet fork", not before it.
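
The ordering can be imitated in plain Lua with a queue of pending coroutines. This is only a sketch of the idea, not Skynet's actual implementation; fork and run are hypothetical stand-ins defined here for illustration.

```lua
-- Hypothetical stand-ins for illustration; this is not Skynet code.
local pending = {}   -- coroutines waiting for their first run
local output = {}

local function fork(fn)
    pending[#pending + 1] = coroutine.create(fn)   -- queue only, do not run yet
end

local function run(main)
    main()                                         -- the current function runs first
    local i = 1
    while i <= #pending do                         -- then queued coroutines get a turn
        assert(coroutine.resume(pending[i]))
        i = i + 1
    end
    pending = {}
end

run(function()
    output[#output + 1] = "before fork"
    fork(function() output[#output + 1] = "inside coroutine" end)
    output[#output + 1] = "after fork"
end)

print(table.concat(output, "\n"))
```

The queued function is entered only after the function that called fork has returned, which reproduces the print order shown above.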

Resident Coroutines

In openUBMC, it is often necessary to create infinite loops to read data from hardware periodically. In such scenarios, coroutines can greatly simplify code development.

lua
local skynet = require 'skynet'

function main()
    some_logic()
    skynet.fork(function()
        while true do
            some_function_to_get_hardware_reading()
            skynet.sleep(100)
        end
    end)
    other_logic()
end

Note:

The unit of skynet.sleep is 10 milliseconds. Therefore, skynet.sleep(100) actually means 10 × 100 milliseconds, which is 1 second.

In the preceding example, a coroutine is created with an infinite loop in its function. Therefore, this coroutine never finishes execution.

Each time some_function_to_get_hardware_reading() finishes execution, the coroutine is suspended for 1 second and then executed again.

During the suspension, the component can still execute other service logic. After 1 second, it executes some_function_to_get_hardware_reading() again.
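
Because skynet.sleep counts in units of 10 ms, a small conversion helper can make call sites easier to read. The helper below is hypothetical (it is not part of Skynet or openUBMC) and is written in plain Lua so that it can be checked on its own.

```lua
-- Hypothetical helper, not part of Skynet: convert milliseconds to skynet.sleep units.
local TICK_MS = 10   -- one skynet.sleep unit is 10 ms

local function ms_to_ticks(ms)
    -- Round up so that the sleep is never shorter than requested.
    return math.ceil(ms / TICK_MS)
end

print(ms_to_ticks(1000))   -- 100 units, which is 1 second
print(ms_to_ticks(15))     -- 2 units, which is 20 ms
```

Inside a service, such a helper would be used as skynet.sleep(ms_to_ticks(1000)).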

Coroutine Exit

A coroutine exits only when its function finishes execution, either by returning normally or by raising an error. Whether a coroutine can exit therefore depends on whether its function can reach one of these two endings.

lua
skynet.fork(function()
    local retry = 0
    while retry < 3 do
        if hardware_not_exist() then
            error("hardware missing")
        end
        local ok = set_value_to_hardware()
        if ok then
            print("set success!")
            return
        end
        retry = retry + 1
        skynet.sleep(100)
    end
    print("set fail after 3 times!")
end)

The preceding example is a common case for setting hardware values with three retries.

On every path (success, failure after three retries, or an error because the hardware is missing), the function is designed to finish execution. Therefore, the coroutine always ends and does not exist permanently.

This type of exit is called active exit of a coroutine, where the caller of the exit is the coroutine itself.
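
Both endings of an active exit can be checked with plain Lua coroutines. The sketch below does not use Skynet; run_to_end is a hypothetical helper defined only for this illustration.

```lua
-- Hypothetical helper: run a function in a coroutine and report how it ended.
local function run_to_end(fn)
    local co = coroutine.create(fn)
    local ok, err = coroutine.resume(co)
    return ok, err, coroutine.status(co)
end

-- Normal return.
local ok1, _, status1 = run_to_end(function() return end)
-- Exit through an error.
local ok2, err2, status2 = run_to_end(function() error("hardware missing") end)

print(ok1, status1)   -- true   dead
print(ok2, status2)   -- false  dead
print(err2)           -- error message, prefixed with the source position
```

In both cases the coroutine ends up in the dead state; the only difference is whether resume reports success or an error message.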


However, in some scenarios where a coroutine cannot exit internally, you can use skynet.killthread to end it.

lua
local co = skynet.fork(function()
    while true do
        print("long live coroutine!")
        skynet.sleep(100)
    end
end)

function stop_coroutine()
    skynet.killthread(co)
end

When a coroutine is created, skynet.fork returns a handle to it. You can pass this handle to skynet.killthread to force the coroutine to terminate when necessary.

This type of exit is called passive exit of a coroutine, where the caller of the exit is not the coroutine itself but another coroutine. If the coroutine that calls stop_coroutine is never triggered or its turn has not come yet, the resident coroutine will continue to execute.

Lua SDK Task Mechanism

Although Skynet makes it easy to create coroutines, it requires developers to learn Skynet-specific concepts. To reduce this dependency on Skynet, openUBMC wraps coroutines and provides the mc.tasks mechanism.

lua
local tasks = require 'mc.tasks'

-- self is assumed to be the enclosing component object that owns TemperatureCelsius.
local t = tasks.get_instance():new_task("unique task id")
t:loop(function(task)
    if self.TemperatureCelsius > 120 then
        task:stop() -- The task provides a stop operation internally.
    end
    self.TemperatureCelsius = self.TemperatureCelsius + 1
end)
t:set_timeout_ms(5000) -- Set the polling cycle of the resident coroutine.

function stop_task()
    tasks.get_instance():new_task():once(function()
        tasks:sleep_ms(1000) -- Suspend the coroutine for 1s.
        t:stop()
    end)
end

When creating a task, you should pass a unique task ID. If a task with the same ID already exists, the system does not create a new one, which prevents a large number of duplicate coroutine tasks from piling up in exception scenarios. If no ID is passed, no duplicate check is performed.

A task provides polling or one-off mechanisms. For a polling mechanism, you must set the polling cycle in milliseconds.

tasks also provides the capability to suspend the current coroutine. The unit of tasks:sleep_ms is also milliseconds.

Within a task coroutine, you can obtain its own handle and end the coroutine based on service requirements.

Coroutine Scheduling and Orchestration

In an operating system (OS), a thread is the smallest unit of scheduling. Since the granularity of a coroutine is smaller than that of a thread, coroutine scheduling cannot depend on the OS. Instead, developers must schedule coroutines. Based on their own service scenarios and requirements, developers can try to fill each time slice provided by the OS with code that needs to run, maximizing CPU utilization.

The concept of coroutine scheduling originated from I/O scenarios. I/O scenarios involve a large amount of waiting time, whether it is the processing delay of the caller, network transmission delay, or OS processing delay, all of which can be utilized.

In openUBMC, all RPC, database read/write, and network read/write operations are asynchronous. This means that when these calls are waiting for a response, the current coroutine is suspended, and other coroutines are executed.

lua
skynet.fork(function()
    print("function 1 rpc start!")
    rpc_takes_10ms_to_return()
    print("function 1 rpc end!")
end)
skynet.fork(function()
    print("function 2 rpc start!")
    rpc_takes_1ms_to_return()
    print("function 2 rpc end!")
end)

Results:

bash
function 1 rpc start!
function 2 rpc start!
function 2 rpc end!
function 1 rpc end!

In the preceding example, both coroutines call rpc. When function 1 calls rpc, its coroutine is suspended, and the function 2 coroutine starts executing. Then, when function 2 calls rpc, its coroutine is also suspended. The subsequent wakeup order of the two coroutines depends on the return order of rpc. In this scenario, the rpc on which the function 2 coroutine depends takes only 1 ms and returns first.
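
The wakeup order can be reproduced in plain Lua with a simulated clock. The sketch below is not how Skynet or openUBMC RPC is implemented; spawn, fake_rpc, and run are hypothetical names, and the latencies of 10 ms and 1 ms are taken from the example above.

```lua
local log = {}
local waiting = {}   -- entries: { co = coroutine, wake = simulated time in ms }
local now = 0        -- simulated clock

local function spawn(fn)
    waiting[#waiting + 1] = { co = coroutine.create(fn), wake = now }
end

-- Hypothetical RPC: suspends the calling coroutine for `ms` simulated milliseconds.
local function fake_rpc(ms)
    coroutine.yield(ms)
end

local function run()
    while #waiting > 0 do
        -- Pick the coroutine with the earliest wakeup time (earliest queued wins ties).
        local idx = 1
        for i = 2, #waiting do
            if waiting[i].wake < waiting[idx].wake then idx = i end
        end
        local entry = table.remove(waiting, idx)
        now = entry.wake
        local ok, delay = coroutine.resume(entry.co)
        assert(ok, delay)
        if coroutine.status(entry.co) ~= "dead" then
            entry.wake = now + delay   -- wake up when the simulated RPC returns
            waiting[#waiting + 1] = entry
        end
    end
end

spawn(function()
    log[#log + 1] = "function 1 rpc start!"
    fake_rpc(10)
    log[#log + 1] = "function 1 rpc end!"
end)
spawn(function()
    log[#log + 1] = "function 2 rpc start!"
    fake_rpc(1)
    log[#log + 1] = "function 2 rpc end!"
end)
run()

print(table.concat(log, "\n"))
```

Swapping the two latencies swaps the order of the two "end" lines, which shows that the order is decided by when each call returns rather than by the order in which the coroutines were created.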

Note:

In normal scenarios, the preceding execution order is fixed, and two coroutines do not run in parallel.

However, in actual environments, many factors can interfere. Therefore, when writing code, you should not assume that the order is fixed.

If a specific order is required, use a single coroutine to perform RPC calls instead of splitting them into multiple coroutines.

Sleeping and Waiting

In thread development, under normal CPU conditions, the OS ensures the reliability of the sleep() time and tries to ensure fairness through a strict switching mechanism. Therefore, in most scenarios, threads are awakened at the time the developer specified. In coroutine development, however, the scheduler is the program itself, so precise scheduling is difficult to achieve.

Consider two coroutines created at the same time. Coroutine 1 runs task 1, waits for waiting time 1, and then runs task 2. Coroutine 2 runs task 3, waits for waiting time 2, and then runs task 4. Task 1 in coroutine 1 is executed first. After coroutine 1 is switched out to wait, task 3 starts running.

Since task 3 requires a longer CPU time slice than waiting time 1, when coroutine 1 is awakened to continue executing task 2, it is already later than the designed waiting time. The actual completion time of task 2 is later than the completion time designed in the code.

For coroutine 2, since waiting time 2 is longer and task 2 takes a shorter time to execute, it can still be awakened at the planned time point and execute task 4 as expected.

Summary:

When designing coroutines, keep the tasks within them evenly sized, and avoid long-running blocking operations in a single coroutine, so that other coroutines do not end up in the situation encountered by coroutine 1.
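
The late wakeup of coroutine 1 can be reproduced with a simulated clock in plain Lua. This is a sketch under assumed numbers: the task durations (1 ms and 50 ms) and waiting times (10 ms and 100 ms) are made up for illustration, and spawn, sleep, busy, and run are hypothetical helpers, not Skynet functions.

```lua
local now = 0        -- simulated clock in milliseconds
local waiting = {}   -- entries: { co = coroutine, wake = planned wakeup time }
local results = {}

local function spawn(fn)
    waiting[#waiting + 1] = { co = coroutine.create(fn), wake = now }
end

local function sleep(ms) coroutine.yield(ms) end   -- suspend the current coroutine
local function busy(ms) now = now + ms end         -- CPU-bound work that never yields

local function run()
    while #waiting > 0 do
        local idx = 1
        for i = 2, #waiting do
            if waiting[i].wake < waiting[idx].wake then idx = i end
        end
        local entry = table.remove(waiting, idx)
        now = math.max(now, entry.wake)   -- a wakeup can be late, but never early
        local ok, delay = coroutine.resume(entry.co)
        assert(ok, delay)
        if coroutine.status(entry.co) ~= "dead" then
            entry.wake = now + delay
            waiting[#waiting + 1] = entry
        end
    end
end

spawn(function()               -- coroutine 1
    busy(1)                    -- task 1
    local planned = now + 10
    sleep(10)                  -- waiting time 1
    results[1] = { planned = planned, actual = now }
    busy(1)                    -- task 2
end)
spawn(function()               -- coroutine 2
    busy(50)                   -- task 3, longer than waiting time 1
    local planned = now + 100
    sleep(100)                 -- waiting time 2
    results[2] = { planned = planned, actual = now }
    busy(1)                    -- task 4
end)
run()

for i, r in ipairs(results) do
    print(("coroutine %d: planned wakeup %d ms, actual %d ms"):format(i, r.planned, r.actual))
end
```

With these numbers, coroutine 1 plans to wake at 11 ms but resumes at 51 ms because task 3 holds the CPU, while coroutine 2 wakes at its planned 151 ms.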

Processes and Worker Threads

In a Skynet process, the system creates a main thread, a health monitoring thread, a timer thread, an I/O thread, and a configurable number of worker threads. A single service runs in only one worker thread at a time, while different services can run simultaneously on different worker threads. Therefore, configuring more worker threads than services brings no benefit, and the number of worker threads is the maximum number of components that can run at the same instant.

It follows that a Skynet process that hosts only one service wastes the fixed overhead of the process. Conversely, putting the whole system into a single process does not make good use of OS resources. Similarly, if a process is configured with only one worker thread, only one component in that process can run at a time. Therefore, the number of processes and the number of components in each process must be balanced carefully.

In openUBMC, you are advised to merge multiple components into one process. openUBMC has further optimized RPC and data access for components within a process, resulting in very low communication overhead between components. For some large components or components with high requirements for security and stability, a separate Skynet process will be started. For details, see hica.

In the configuration file of a process, you can configure the number of worker threads.

lua
thread = 10

This configuration sets 10 worker threads. The process can host more than 10 components, but at most 10 of them run at the same instant.

Coroutine Queues

In some scenarios, the number of coroutines is determined by external callers, but the coroutines must access a shared resource one at a time. In such scenarios, you can use skynet.queue to run them in sequence.
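
A minimal sketch of this pattern follows, assuming the upstream Skynet skynet.queue module, which returns a constructor for a critical section. It runs only inside a Skynet service. The function access_hardware and the 100 ms sleep are hypothetical placeholders for a real resource access that may suspend the coroutine.

```lua
local skynet = require 'skynet'
local queue = require 'skynet.queue'

-- One critical section per shared resource.
local lock = queue()

-- Hypothetical accessor: the body stands for any operation that may yield.
local function access_hardware(id)
    return lock(function()
        -- Only one coroutine at a time executes this body, in arrival order.
        skynet.error('coroutine', id, 'owns the resource')
        skynet.sleep(10) -- simulate a yielding operation (10 x 10 ms)
    end)
end

skynet.start(function()
    for id = 1, 3 do
        skynet.fork(access_hardware, id)
    end
end)
```

Without the critical section, the three coroutines would interleave at the sleep call; with it, each one enters the body only after the previous one has left.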

Worker Mechanism

In some scenarios, you need to create threads temporarily for service operations, such as calling C functions that may be blocking. The openUBMC Lua SDK provides a worker mechanism, which offers capabilities similar to those in C/C++ programming for creating threads in Lua. For details, see worker_core Work Thread.