Application Startup Sequence¶
The most common way to start up a Baseplate.py application is to run the baseplate-serve script. This page explains exactly what goes on between that command and your application.
Note
baseplate-script is another way to run code in a Baseplate.py application. It is generally useful for ephemeral jobs like periodic crons or ad hoc tasks like migrations. That script follows a much-abbreviated form of this sequence.
The Python Interpreter¶
Because this is a Python application, before any code in Baseplate or your application can run, the Python interpreter must first set itself up.
There are many steps involved in Python’s startup sequence, but for our purposes the most important thing to highlight is the set of environment variables that can configure the interpreter.
Now that the interpreter is up, it runs the actual program we wanted it to (baseplate-serve) and the Baseplate.py startup sequence begins.
Gevent Monkeypatching¶
Before doing anything else, Baseplate.py monkeypatches the standard library to use Gevent, a library that transparently makes Python asynchronous. This allows us to simulate simultaneously processing many requests by interleaving their work and switching the CPU between them as they wait for IO operations like network requests. Monkeypatching replaces most of the APIs in the Python standard library that can block a process with ones provided by Gevent which take advantage of the blocking to swap to other work.
Warning
While Gevent gives us easy concurrency, it does not give us parallelism. Python is still fundamentally processing these requests in one thread, one task at a time. Keep an eye out for code that would not yield to other tasks, like CPU-bound loops, APIs that don’t have asynchronous equivalents (like flock()), or dropping into gevent-unaware native extensions. See the blocked_hub monitor in Prometheus Exporter for a tool that can help debug this class of problem.
Monkeypatching is done as early as possible in the process to ensure that all other parts of the startup sequence use the monkeypatched IO primitives.
For more details on Gevent and how it works, see gevent For the Working Python Developer.
Extending the PYTHONPATH¶
Python uses a list of directories, sourced from the environment variable PYTHONPATH, to search for libraries when doing imports. Because it’s common to want to run applications from the current directory, Baseplate.py adds the current directory to the front of the path.
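The effect of this step can be sketched in a few lines. This assumes the search path is adjusted through sys.path, the in-process list Python consults on import. The function name prepend_cwd is illustrative and not taken from Baseplate.py.

```python
import os
import sys


def prepend_cwd() -> None:
    """Put the current working directory first on the import search path."""
    # Hypothetical helper: Baseplate.py's own code may do this differently.
    sys.path.insert(0, os.getcwd())


prepend_cwd()
print(sys.path[0])  # the current working directory
```

Because the directory goes at the front of the list, modules in the current directory take precedence over installed packages with the same name.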
Listening for signals¶
Baseplate.py registers some handlers for signals
that allow
the outside system to interact with it once running. The following signals have
handlers defined:
SIGUSR1
Dump a stack trace to stdout. This can be useful for debugging if the process is not responsive.
SIGTERM
Initiate graceful shutdown. The server will stop accepting new requests and shut down as soon as all currently in-flight requests are processed, or a timeout occurs.
SIGUSR2
Same as SIGTERM. For use with Einhorn.
SIGINT
Same as SIGTERM. For Ctrl-C on the command line.
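As a rough illustration of the table above, the sketch below wires the same four signals to two handlers using only the standard library. The handler names and the shutdown_requested event are assumptions made for this example. Baseplate.py's real handlers also coordinate with the server to drain in-flight requests, which is not shown here.

```python
import signal
import sys
import threading
import traceback

# Hypothetical flag a server loop could poll to begin graceful shutdown.
shutdown_requested = threading.Event()


def dump_stack(signum, frame):
    """Write the interrupted stack to stdout (the SIGUSR1 behavior)."""
    traceback.print_stack(frame, file=sys.stdout)


def request_shutdown(signum, frame):
    """Record that shutdown was requested (SIGTERM, SIGUSR2, SIGINT)."""
    shutdown_requested.set()


def register_signal_handlers():
    """Install the handlers; must run in the main thread on a POSIX system."""
    signal.signal(signal.SIGUSR1, dump_stack)
    for signum in (signal.SIGTERM, signal.SIGUSR2, signal.SIGINT):
        signal.signal(signum, request_shutdown)


# Calling a handler directly shows its effect without sending a real signal.
request_shutdown(signal.SIGTERM, None)
print(shutdown_requested.is_set())
```

With register_signal_handlers() called at startup, running kill -USR1 against the process ID would print the current stack without stopping the process.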
Parsing command line arguments¶
Command line arguments are parsed using the Python-standard argparse machinery.
baseplate-serve requires only one argument: a path to the configuration file for your service. The optional arguments --app-name and --server-name control which sections of the config file are read. The remaining options control the way the server runs.
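A parser along these lines could be built as follows. This is a sketch, not the actual baseplate-serve parser. The flag names and defaults come from this page, while the help text, the config_file destination name, and the example file name service.ini are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build an illustrative parser covering the options named on this page."""
    parser = argparse.ArgumentParser(prog="baseplate-serve")
    parser.add_argument("config_file", help="path to the service's configuration file")
    parser.add_argument("--app-name", default="main", help="name of the app section to read")
    parser.add_argument("--server-name", default="main", help="name of the server section to read")
    parser.add_argument("--debug", action="store_true", help="log at DEBUG level")
    parser.add_argument("--bind", default="127.0.0.1:9090", help="address to listen on")
    return parser


# service.ini is a hypothetical file name used only for this example.
args = build_parser().parse_args(["--app-name=foo", "service.ini"])
print(args.app_name, args.server_name, args.config_file)
```

Only the positional argument is required. Everything else falls back to a default, which is why a bare baseplate-serve invocation with just a config path is enough.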
Parsing the configuration file¶
Baseplate.py loads the configuration file from the path given on the command line. The raw file on disk is parsed using a configparser.ConfigParser with interpolation disabled.
Configuration files are split up into sections that allow one file to hold configuration for multiple components. There are generally two types of section in the config file: application configuration sections that look like [app:foo] and server configuration sections that look like [server:bar]. After parsing the configuration file, Baseplate.py uses the section names specified in the --app-name and --server-name command line arguments to determine which sections to pay attention to. If not specified on the command line, the default section name is main. For example, baseplate-serve --app-name=foo would load the [app:foo] and [server:main] sections from the config file.
Note
If you use multiple app or server blocks you may find yourself with a lot of repetition. You can move duplicated configuration to a meta-section called [DEFAULT] and it will automatically be inherited by all other sections in the file (unless overridden locally).
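The section selection and [DEFAULT] inheritance described above can be tried directly with the standard library. In this sketch the config text, the load_sections helper, and the my.server:make_server value are all made up for illustration. Only the use of configparser.ConfigParser with interpolation disabled comes from this page.

```python
import configparser

# Hypothetical configuration, inlined so the example is self-contained.
EXAMPLE_CONFIG = """
[DEFAULT]
log_prefix = %(not_interpolated)s

[app:foo]
factory = my.module:my_callable

[server:main]
factory = my.server:make_server
"""


def load_sections(text, app_name="main", server_name="main"):
    """Parse the config text and return the selected app and server sections."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    # Each section view also exposes the values inherited from [DEFAULT].
    app_config = dict(parser[f"app:{app_name}"])
    server_config = dict(parser[f"server:{server_name}"])
    return app_config, server_config


app_config, server_config = load_sections(EXAMPLE_CONFIG, app_name="foo")
print(app_config)
print(server_config)
```

Because interpolation is disabled, the %(...)s text in log_prefix is returned verbatim rather than being treated as a reference to another key, and both sections see it through [DEFAULT].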
The server configuration section is used to determine which server implementation to use and then the rest of the configuration is passed on to that server for instantiation. See Loading the server for more details. The application configuration section determines how to load your application and then the rest of the configuration is passed on to your code. See the Loading your code section for more details.
Configuring Logging¶
Next up, Baseplate.py configures Python’s logging system. The default configuration is:
Logs are written to stdout.
The default log level is INFO unless the --debug command line argument was passed, which changes the log level to DEBUG.
A baseline structured logging format is applied to log messages; see the logging observer’s documentation for details.
This configuration affects all messages emitted through logging (but not e.g. print() calls).
If a [loggers] section is present in your configuration file, logging is given a chance to override configuration using the standard logging config file format. This can be useful if you want finer-grained control over which messages get filtered out.
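The first two defaults can be approximated with logging.basicConfig. This is a simplified sketch: the configure_logging name is an assumption, and the structured log format Baseplate.py applies is not reproduced here.

```python
import logging
import sys


def configure_logging(debug: bool = False) -> None:
    """Send log records to stdout at INFO, or at DEBUG when debug is set."""
    level = logging.DEBUG if debug else logging.INFO
    # force=True replaces any handlers already attached to the root logger.
    logging.basicConfig(stream=sys.stdout, level=level, force=True)


configure_logging(debug=True)
logging.getLogger("example").debug("visible because debug was requested")
```

For the [loggers] override, the standard library entry point for that file format is logging.config.fileConfig().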
Loading your code¶
The next step is to load up your application code.
Baseplate.py looks inside the selected [app:foo] section for a setting named factory. The value of this setting should be the full name of a callable, like my.module:my_callable, where the part before the colon is a module to import and the part after is a name within that module. The referenced module is imported with importlib.import_module() and then the referenced name is retrieved with getattr() on that module object.
Once the callable is loaded, Baseplate.py passes in the parsed settings from the selected [app:foo] section and waits for the function to return an application object. This is where your application can do all of its one-time startup logic outside of request processing.
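The lookup described above amounts to a split on the colon followed by an import and an attribute fetch. The load_factory helper below is a hypothetical stand-in for Baseplate.py's loader, and it targets json:dumps only so that the example runs without any application code.

```python
import importlib


def load_factory(spec: str):
    """Resolve a 'module.path:name' string to the object it names."""
    module_name, _, attr_name = spec.partition(":")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


# A standard library callable stands in for my.module:my_callable here.
dumps = load_factory("json:dumps")
print(dumps([1, 2]))
```

A misspelled module raises ImportError and a misspelled name raises AttributeError, so mistakes in the factory setting surface at startup rather than on the first request.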
Binding listening sockets¶
Unless running under Einhorn, Baseplate.py needs to create and bind a socket for the server to listen on. The address bound to is selected by the --bind option and defaults to 127.0.0.1:9090.
Two socket options are applied when binding a socket:
SO_REUSEADDR
This allows us to bind the socket even when connections from previous incarnations are still lingering in TIME_WAIT state.
SO_REUSEPORT
This allows multiple instances of our application to bind to the same socket, and the kernel distributes connections to them according to a deterministic algorithm. See this explanation of SO_REUSEPORT for more information. This is generally only useful under Einhorn, where multiple processes are run on the same host.
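A minimal sketch of binding with both options set, using the standard socket module. The bind_socket helper, the IPv4-only address parsing, and the backlog of 128 are assumptions for this example. The hasattr guard is there because SO_REUSEPORT is not defined on every platform.

```python
import socket


def bind_socket(address: str) -> socket.socket:
    """Create a listening TCP socket bound to a 'host:port' address."""
    host, _, port = address.rpartition(":")
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Allow rebinding while old connections linger in TIME_WAIT.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # Allow several processes to bind the same address where supported.
    if hasattr(socket, "SO_REUSEPORT"):
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((host, int(port)))
    sock.listen(128)
    return sock


sock = bind_socket("127.0.0.1:0")  # port 0 asks the OS for any free port
print(sock.getsockname())
sock.close()
```

Both options have to be set before bind() is called, since they change how the kernel treats the bind request itself.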
Loading the server¶
Baseplate.py now loads the actual server code that will run the main application loop from here on out.
This process is very similar to loading your application code. The factory setting in the selected [server:foo] section of the configuration file is inspected to determine which code to load. This is generally one of the server implementations in Baseplate.py but you can write your own in your application if needed. Once loaded, the rest of the configuration is passed on to the loaded callable.
The new server object has expectations of what kind of application object your application factory returned. For example, an HTTP server expects a WSGI callable while the Thrift server expects a TProcessor object.
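To make the HTTP case concrete, here is what an application factory returning a bare WSGI callable might look like. The make_wsgi_app name and the greeting setting are invented for this sketch. A real service would usually build its application with a web framework rather than by hand.

```python
def make_wsgi_app(app_config):
    """Hypothetical application factory: takes settings, returns a WSGI callable."""
    # One-time startup work happens here, outside of request processing.
    greeting = app_config.get("greeting", "hello")

    def application(environ, start_response):
        """Handle one request following the WSGI calling convention."""
        body = greeting.encode("utf-8")
        headers = [
            ("Content-Type", "text/plain"),
            ("Content-Length", str(len(body))),
        ]
        start_response("200 OK", headers)
        return [body]

    return application


app = make_wsgi_app({"greeting": "hi"})
```

The factory runs once at startup, while the callable it returns is invoked by the server for every request.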
Handing off¶
Once everything is set up, Baseplate.py writes “Listening on <address>” to the log and hands off control to the server object which is expected to serve forever (unless one of the signals registered above is received) and use your application to handle requests.