Commit 4146c8e9 authored by Christoph Berg

Rewrite architecture.html as README.md.

parent 7d939917
Multi-Version/Multi-Cluster PostgreSQL architecture
===================================================
2004, Oliver Elphick, Martin Pitt
Solving a problem
-----------------
When a new major version of PostgreSQL is released, it is necessary to dump and
reload the database. The old software must be used for the dump, and the new
software for the reload.
This was a major problem for Red Hat and Debian, because a dump and reload was
not required by every upgrade, and by the time the need for a dump was realised,
the old software might have been deleted. Debian had certain rather unreliable
procedures to save the old software and use it to do a dump, but these
procedures often went wrong. Red Hat's installation environment is so rigid that
it is not practicable for the Red Hat packages to attempt an automatic upgrade.
Debian offered a debconf choice for whether to attempt automatic upgrading; if
it failed or was not allowed, a manual upgrade had to be done, either from a
pre-existing dump or by manual invocation of the postgresql-dump script.
It is possible to run different versions of PostgreSQL simultaneously, and
indeed to run the same version on separate database clusters simultaneously. To
do so, each postgres instance must listen on a different port, so each client
must specify the correct port. By having two separate versions of the
PostgreSQL packages installed simultaneously, it is simple to do database
upgrades by dumping from the old version and reloading into the new. The
PostgreSQL client wrapper is designed to permit this.
General Architecture idea
-------------------------
The Debian packaging has been changed to create a new package for each major
version. The criterion for creating a new package is that initdb is required
when upgrading from the previous version. Thus, there are now source packages
`postgresql-8.1` and `postgresql-8.3` (and similarly for all the binary
packages).
The legacy postgresql and the other existing binary package names have become
dummy packages depending on one of the versioned equivalents. Their only
purpose is now to ensure a smooth upgrade and to register the existing database
cluster to the new architecture. These packages will be removed from the
archive as soon as the next Debian release after Sarge (Etch) is released.
Each versioned package installs into `/usr/lib/postgresql/version`. To allow
users to easily select the right version and cluster when working, the
`postgresql-common` package provides the `pg_wrapper` program, which reads the
per-user and system-wide configuration files and forks the correct executable
with the correct library versions according to those preferences. `/usr/bin`
provides executables soft-linked to `pg_wrapper`.
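
The selection step can be pictured with a small sketch. This is not the actual
`pg_wrapper` code, which this document does not show. The function names, the
`bin` subdirectory and the way preferences are passed in are assumptions made
for illustration. It only models the rule that a per-user preference takes
priority over the system-wide one.

```python
from typing import Optional

# The document states only the per-version prefix; the "bin"
# subdirectory used below is an assumption for this sketch.
VERSION_ROOT = "/usr/lib/postgresql"


def choose_version(user_pref: Optional[str],
                   system_pref: Optional[str]) -> Optional[str]:
    """Return the per-user preference if set, else the system-wide one."""
    return user_pref if user_pref is not None else system_pref


def executable_path(program: str, version: str) -> str:
    """Build the versioned path a wrapper would hand control to."""
    return f"{VERSION_ROOT}/{version}/bin/{program}"
```

For example, `executable_path("psql", choose_version(None, "8.3"))` yields a
path below `/usr/lib/postgresql/8.3`.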
This architecture also allows separate database clusters to be maintained for
the use of different groups of users; these clusters need not all be of the
same major version. This allows much greater flexibility for those people who
need to make application software changes consequent on a PostgreSQL upgrade.
Detailed structure
------------------
### Configuration hierarchy
* `/etc/postgresql-common/user_clusters`: maps users to clusters and
  default databases
* `$HOME/.postgresqlrc`: per-user preferences for default version/cluster and
  database; overrides `/etc/postgresql-common/user_clusters`
* `/etc/postgresql/version/clustername`: cluster-specific configuration files:
  * `postgresql.conf`, `pg_hba.conf`, `pg_ident.conf`
  * optionally `start.conf`: startup mode of the cluster: `auto` (start/stop in
    init script), `manual` (do not start/stop in init script, but manual
    control with `pg_ctlcluster` is possible), `disabled` (`pg_ctlcluster`
    is not allowed).
  * optionally `pg_ctl.conf`: options to be passed to `pg_ctl`.
  * optionally a symbolic link `log` which points to the postgres log file.
    Defaults to `/var/log/postgresql/postgresql-version-cluster.log`.
    Explicitly setting `log_directory` and/or `log_filename` in
    `postgresql.conf` overrides this.
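
The three `start.conf` modes can be summarised as two yes/no questions. The
sketch below is only a reading of the description above, not code taken from
`pg_ctlcluster` or the init script. All function names are hypothetical.

```python
# Startup modes as named in the list above.
VALID_MODES = ("auto", "manual", "disabled")


def _check(mode: str) -> None:
    """Reject anything that is not one of the documented modes."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown start.conf mode: {mode!r}")


def started_by_init_script(mode: str) -> bool:
    """Only 'auto' clusters are started and stopped by the init script."""
    _check(mode)
    return mode == "auto"


def manual_control_allowed(mode: str) -> bool:
    """pg_ctlcluster may be used unless the cluster is 'disabled'."""
    _check(mode)
    return mode != "disabled"
```

A `manual` cluster is therefore skipped by the init script but can still be
controlled by hand, while a `disabled` cluster allows neither.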
### Per-version files and programs
* `/usr/lib/postgresql/version`
* `/usr/share/postgresql/version`
* `/usr/share/doc/postgresql/postgresql-doc-version`:
version specific program and data files
### Common programs
* `/usr/share/postgresql-common/pg_wrapper`: environment chooser and program selector
* `/usr/bin/program`: symbolic links to pg_wrapper, for all client programs
* `/usr/bin/pg_lsclusters`: list all available clusters with their status and configuration
* `/usr/bin/pg_createcluster`: wrapper for `initdb`, sets up the necessary configuration structure
* `/usr/bin/pg_ctlcluster`: wrapper for `pg_ctl`, controls the cluster's postgres server
* `/usr/bin/pg_upgradecluster`: upgrade a cluster to a newer major version
* `/usr/bin/pg_dropcluster`: remove a cluster and its configuration
### /etc/init.d/postgresql
This script handles the postgres server processes for each version and all
their clusters. However, most of the actual work is done by the new
`pg_ctlcluster` program.
### pg_upgradecluster
This program replaces postgresql-dump (a Debian specific program).
It is used to migrate a cluster from one major version to another.
Usage: `pg_upgradecluster [-v newversion] version name [data_dir]`
`-v`: specifies the version to upgrade to; defaults to the newest available version.
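
As an illustration of the synopsis, the following sketch assembles an argument
list in the order given above. The helper name `upgrade_command` is
hypothetical, and the cluster name `main` and versions in the example are
placeholder values, not taken from this document's usage line.

```python
from typing import List, Optional


def upgrade_command(version: str, name: str,
                    new_version: Optional[str] = None,
                    data_dir: Optional[str] = None) -> List[str]:
    """Assemble argv following the usage synopsis above."""
    argv = ["pg_upgradecluster"]
    if new_version is not None:
        # Optional: without -v the newest available version is used.
        argv += ["-v", new_version]
    argv += [version, name]
    if data_dir is not None:
        # Optional trailing data directory argument.
        argv.append(data_dir)
    return argv
```

Calling `upgrade_command("8.1", "main", new_version="8.3")` produces the
arguments for upgrading a cluster named `main` from 8.1 to 8.3. The list could
then be passed to something like `subprocess.run`.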
-- The Debian PostgreSQL maintainers
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="author" content="Oliver Elphick, Martin Pitt" />
<title>Multiversion/Multicluster PostgreSQL architecture</title>
</head>
<body>
<h1>Multi-Version/Multi-Cluster PostgreSQL architecture</h1>
<h2>Solving a problem</h2>
<p>
When a new major version of PostgreSQL is released, it is necessary to
dump and reload the database. The old software must be used for the dump,
and the new software for the reload.</p>
<p>This was a major problem for Red Hat and Debian, because a dump and reload was
not required by every upgrade and by the time the need for a dump is
realised, the old software might have been deleted. Debian had certain rather
unreliable procedures to save the old software and use it to do a dump, but
these procedures often went wrong. Red Hat's installation environment is so
rigid that it is not practicable for the Red Hat packages to attempt an
automatic upgrade. Debian offered a debconf choice for
whether to attempt automatic upgrading; if it failed or was not allowed, a
manual upgrade had to be done, either from a pre-existing dump or by
manual invocation of the postgresql-dump script.</p>
<p>There was once an upstream program called <b>pg_upgrade</b> which could be
used for in-place upgrading. This does not currently work and does not seem to
be a high priority with upstream developers. </p>
<p>It is possible to run different versions of PostgreSQL simultaneously, and
indeed to run the same version on separate database clusters simultaneously.
To do so, each postgres instance must listen on a different port, so each client
must specify the correct port. By having two separate
versions of the PostgreSQL packages installed simultaneously, it is
simple to do database upgrades by dumping from the old version and
uploading to the new. The PostgreSQL client wrapper is designed to
permit this.</p>
<h2>General Architecture idea</h2>
<p>The Debian packaging has been changed to create a new package for each major
version. The criterion for creating a new package is that initdb is required
when upgrading from the previous version. Thus, there are now source packages
<code>postgresql-8.1</code> and <code>postgresql-8.3</code> (and similarly for
all the binary packages).</p>
<p>The legacy postgresql and the other existing binary package names have
become dummy packages depending on one of the versioned equivalents. Their only
purpose is now to ensure a smooth upgrade and to register the existing database
cluster to the new architecture. These packages will be removed from the
archive as soon as the next Debian release after Sarge (Etch) is released.</p>
<p>Each versioned package installs into
<code>/usr/lib/postgresql/<i>version</i></code>. In order to allow users
easily to select the right version and cluster when working, the
<code>postgresql-common</code> package provides the <b>pg_wrapper</b> program,
which reads the per-user and system wide configuration file and forks the
correct executable with the correct library versions according to those
preferences. <code>/usr/bin</code> provides executables soft-linked to
pg_wrapper.</p>
<p>This architecture also allows separate database clusters to be maintained
for the use of different groups of users; these clusters need not all be of the
same major version. This allows much greater flexibility for those people
who need to make application software changes consequent on a PostgreSQL
upgrade.</p>
<h2>Detailed structure</h2>
<h3>Configuration hierarchy</h3>
<table cellpadding="6" cellspacing="0" border="1">
<tr><td><code>/etc/postgresql-common/user_clusters</code></td> <td>maps users
against clusters and default databases</td></tr>
<tr><td><code>$HOME/.postgresqlrc</code></td> <td>per-user preferences for
default version/cluster and database; overrides
<code>/etc/postgresql-common/user_clusters</code></td></tr>
<tr>
<td><code>/etc/postgresql/<i>version</i>/<i>clustername</i></code></td>
<td>Cluster-specific configuration files:
<ul>
<li><code>postgresql.conf</code>, <code>pg_hba.conf</code>, <code>pg_ident.conf</code></li>
<li>optionally <code>start.conf</code>: startup mode of the
cluster: <code>auto</code> (start/stop in init script),
<code>manual</code> (do not start/stop in init script, but manual
control with <code>pg_ctlcluster</code> is possible), <i>disabled</i>
(<code>pg_ctlcluster</code> is not allowed).</li>
<li>optionally <code>pg_ctl.conf</code>: options to be passed to pg_ctl.</li>
<li>optionally a symbolic link <code>log</code> which points to
the postgres log file. Defaults to
<code>/var/log/postgresql/postgresql-</code><i>version</i><code>-</code><i>cluster</i><code>.conf</code>.
Explicitly setting <code>log_directory</code> and/or
<code>log_filename</code> in <code>postgresql.conf</code>
overrides this.</li>
</ul>
</td>
</tr>
</table>
<h3>Per-version files and programs</h3>
<table cellpadding="6" cellspacing="0" border="1">
<tr><td><code>/usr/lib/postgresql/<i>version</i></code></td> <td colspan="0" rowspan="3" valign="middle">version specific program and data files</td></tr>
<tr><td><code>/usr/share/postgresql/<i>version</i></code></td></tr>
<tr><td><code>/usr/share/doc/postgresql/postgresql-doc-<i>version</i></code></td></tr>
</table>
<h3>Common programs</h3>
<table cellpadding="6" cellspacing="0" border="1">
<tr><td><code>/usr/share/postgresql-common/pg_wrapper</code></td> <td>environment chooser and program selector</td></tr>
<tr><td><code>/usr/bin/<i>program</i></code></td> <td>symbolic links to pg_wrapper, for all client programs</td></tr>
<tr><td><code>/usr/bin/pg_lsclusters</code></td> <td>list all available clusters with their status and configuration</td></tr>
<tr><td><code>/usr/bin/pg_createcluster</code></td><td>wrapper for <code>initdb</code>, sets up the necessary configuration structure</td></tr>
<tr><td><code>/usr/bin/pg_ctlcluster</code></td><td>wrapper for <code>pg_ctl</code>, control the cluster <b>postgres</b> server</td></tr>
<tr><td><code>/usr/bin/pg_upgradecluster</code></td><td>Upgrade a cluster to a newer major version.</td></tr>
<tr><td><code>/usr/bin/pg_dropcluster</code></td><td>remove a cluster and its configuration</td></tr>
</table>
<h3>psql</h3>
<p>We have abandoned the old non-standard error abort if a connection database
is not specified; psql is not expected to be run directly and all
connection parameters should be provided by pg_wrapper as specified above. In
addition, if no explicit default database is specified in
<code>user_clusters</code>, the default database will correspond to the user
name, thus reintroducing the default upstream behaviour.</p>
<h3>/etc/init.d/postgresql</h3>
<p>This script now handles the postgres server processes for each version and
all their clusters. However, most of the actual work is done by the new
<code>pg_ctlcluster</code> program.</p>
<h3>pg_upgradecluster</h3>
<p>This new program replaces postgresql-dump (a Debian specific program).</p>
<p>It is used to migrate a cluster from one major version to another.</p>
<p>Usage: <code>pg_upgradecluster</code> [<code>-v</code> <i>newversion</i>]
<i>version name</i> [<i>data_dir</i>]</p>
<p><code>-v</code> <i>version</i> specifies the version to upgrade to; defaults
to the newest available version.</p>
<p><i><a href="mailto:pkg-postgresql-public@lists.alioth.debian.org">The Debian
PostgreSQL developers</a></i></p>
</body>
</html>
......@@ -5,6 +5,7 @@ postgresql-common (190) UNRELEASED; urgency=medium
* postgresql-common recommends e2fsprogs, we are using chattr in
pg_createcluster. (Closes: #887251)
* PgCommon.pm: Fix include directives parser, spotted by ironhalik, thanks!
* Rewrite architecture.html as README.md.
-- Christoph Berg <myon@debian.org> Mon, 25 Dec 2017 18:03:38 +0100
......
architecture.html
README.md
debian/README.Devel
doc/dependencies.png
systemd/README.systemd