Every database eventually runs into the same wall: storage costs money, and the data nobody queries anymore costs exactly as much as the data everyone does. A five-year-old row occupies the same expensive block storage as the order that came in thirty seconds ago. Postgres doesn't know the difference, and why would it? That's honestly a common refrain among most database engines.
As a result, many have dreamed of fixing this by decoupling compute from storage. Push the cold, ancient data down to cheap object storage like S3, keep the hot data local and fast, and let a single query span both. Better yet, store that cold data in an open columnar format for optimal analytics. It's a vision that launched an entire cottage industry of extensions, forks, and re-architectures, each one tackling the problem in its own particular way. But there are so many approaches now, and none have "the" definitive solution. Each contender seems to demand a different sacrifice, whether that's a forked binary, an extra daemon, a duplicate copy of table data, or eschewing writes entirely. Let's explore the external storage ecosystem and see what we find.
Shared posts
Tridgell: rsync and outrage
Andrew Tridgell has written a blog post responding to complaints that he has begun using LLM tools in his work maintaining rsync:
Like many developers of open source packages I've been hit by a flood of security reports lately in my role as the rsync maintainer. Many of those reports are AI generated (not all though, there are some notable ones with very careful and high quality manual analysis).
As this flood started to get more intense I realised I needed to raise the defences on rsync a lot — we needed much more thorough test suites, code coverage analysis, CI testing on a lot more platforms, deliberate and thorough scanning for possible security issues (so I find at least some of them before other people!) and the addition of a whole lot of defence-in-depth hardening techniques.
[...] Now to the future, because we're not done yet by a long shot. The security reports keep rolling in. I'm working on a bunch of CVEs right now. Luckily I've been joined by some other very good developers with great systems development skills and security knowledge. Some of these people came to my attention partly because of all the rage happening at the moment, so I get some rage storm clouds have silver linings. Watch out for some credits for some great new rsync developers in the next release.
Vibhor Kumar: Beyond Vector Search: Why PostgreSQL Could Become the Memory Layer for Enterprise AI Systems

The conversation around AI infrastructure today is heavily focused on models, GPUs, inference speed, and vector databases. These are important building blocks, but they often distract from a deeper architectural challenge that is beginning to emerge as enterprises move from experimentation toward operational AI systems.
The challenge is memory.
Not memory in the simplistic sense of storing chat history or embeddings, but memory in the broader sense of maintaining durable context, operational continuity, historical understanding, workflow state, reasoning traceability, and business awareness across long-running AI interactions.
Many of the current AI systems appear intelligent during a single interaction, yet surprisingly fragile across time. They can summarize documents, answer questions, call APIs, generate code, and reason effectively within a bounded context window. However, once interactions become long-running, collaborative, stateful, and operationally significant, the limitations quickly become visible.
The issue is not necessarily that the models lack intelligence. The issue is that most AI systems today lack a coherent memory architecture.
As enterprises begin deploying agentic AI systems capable of acting autonomously across workflows, applications, and business processes, this gap becomes increasingly important. In many ways, modern AI agents resemble highly capable employees who forget large portions of their institutional knowledge every few hours. They may understand the current task extremely well, but they often struggle to consistently retain, prioritize, organize, and retrieve contextual knowledge accumulated over time.
This is where I believe PostgreSQL may become significantly more important in the future AI stack than many people currently realize.
Not simply as a vector database.
Not merely as storage for embeddings.
But potentially as the durable memory, operational state, and governance substrate for enterprise AI systems.
The Emerging Problem: AI Systems Need Durable Context
Most AI architectures today are designed around inference rather than continuity.
A common modern AI stack often includes:
- a large language model,
- a vector database,
- object storage,
- caching layers,
- workflow engines,
- orchestration frameworks,
- and observability platforms.
Each component solves a specific technical problem, yet very few architectures solve the broader challenge of long-term contextual continuity.
Enterprise AI agents increasingly need to remember:
- customer interactions,
- operational workflows,
- prior decisions,
- tool outputs,
- approvals,
- business policies,
- unresolved actions,
- evolving facts,
- audit history,
- and relationships between entities over time.
This is fundamentally different from traditional chatbot memory.
A support agent assisting a customer over several weeks must understand not only the latest interaction, but also previous escalations, policy changes, sentiment shifts, outstanding tasks, fraud indicators, and prior resolutions. Similarly, an AI system supporting financial workflows may need to track evolving regulatory constraints, approval chains, transaction histories, and operational exceptions across multiple systems and users.
The challenge quickly evolves from “retrieving relevant documents” into maintaining a coherent operational memory model.
This is where many current AI systems begin to struggle.
Why Vector Search Alone Does Not Solve the Problem
Vector search is an important capability, but semantic similarity by itself is not sufficient for enterprise memory systems.
A vector database can identify information that is semantically related to a query. That is valuable. However, enterprise AI systems require far richer contextual reasoning.
They must answer questions such as:
- Which information is most recent?
- Which facts supersede older facts?
- Which memories are trustworthy?
- Which records are associated with this workflow?
- Which information is the user authorized to access?
- Which decisions caused downstream operational changes?
- Which unresolved tasks remain active?
- Which events are connected temporally or causally?
These are not purely vector problems.
They are relational, transactional, temporal, and governance problems.
An embedding may indicate that two pieces of information are semantically similar, but enterprise systems also require contextual filtering, consistency guarantees, auditability, permissions, workflow awareness, and business reasoning.
This distinction becomes extremely important as organizations move from AI experimentation toward operational AI platforms.
Why PostgreSQL Fits This Space Surprisingly Well
PostgreSQL is uniquely interesting because it already combines many of the foundational capabilities required to support enterprise AI memory architectures within a single platform.
PostgreSQL provides:
- relational consistency,
- JSON document flexibility,
- full-text search,
- vector similarity through pgvector,
- transactional guarantees,
- event and time-series storage,
- mature indexing,
- replication and high availability,
- row-level security,
- governance controls,
- and a highly expressive SQL engine.
Most importantly, PostgreSQL allows these capabilities to coexist within the same operational system.
This matters because enterprise memory is not a single datatype.
An AI memory platform must simultaneously support:
- structured operational data,
- semantic embeddings,
- historical events,
- workflow state,
- relationships between entities,
- audit trails,
- contextual retrieval,
- and policy enforcement.
Many organizations are currently attempting to assemble this capability by stitching together multiple independent systems. A vector database handles semantic search, Redis manages short-term state, object storage retains documents, graph databases model relationships, workflow engines track execution state, and separate audit systems manage compliance requirements.
While this approach can work, it often creates fragmentation. Context becomes distributed across multiple systems with different consistency models, retrieval semantics, security boundaries, and operational characteristics.
PostgreSQL presents a compelling alternative because it can unify many of these concerns into a coherent operational architecture.
Thinking About AI Memory More Like Human Memory
One of the most useful ways to think about this problem is through the lens of human memory.
Humans do not treat all memories equally. We maintain multiple layers of memory operating simultaneously:
- short-term memory,
- episodic memory,
- semantic memory,
- procedural memory,
- and long-term memory.
Enterprise AI systems increasingly require similar structures.
Short-term memory may represent the active context of the current interaction. Episodic memory may capture historical conversations and operational events. Semantic memory may store embeddings, concepts, and generalized knowledge. Procedural memory may represent workflows, business policies, and operational sequences. Long-term memory may preserve durable organizational knowledge and audit history.
PostgreSQL is well positioned to support these layered memory models because it can combine structured, semi-structured, and semantic representations within a unified transactional system.
A PostgreSQL-Based Agent Memory Architecture
A practical architecture for enterprise AI memory may look something like this:
The user interacts with an AI agent or application layer. The agent communicates with a context orchestration layer responsible for assembling relevant operational memory before invoking the language model.
The PostgreSQL layer stores:
- conversations,
- events,
- workflow state,
- tasks,
- documents,
- embeddings,
- relationships,
- tool outputs,
- permissions,
- and audit history.
When a request arrives, the orchestration layer retrieves:
- semantically relevant information,
- recent operational context,
- workflow-specific state,
- user-authorized knowledge,
- and business-critical metadata.
The language model receives only the context necessary for the current task.
This architectural principle is extremely important.
The language model should not be responsible for remembering everything.
The database should.
The AI system should dynamically retrieve and assemble the right context at the right time.
The Rise of Context Engineering
Over time, I believe the industry will shift from emphasizing prompt engineering toward emphasizing context engineering.
Prompt engineering optimizes how instructions are written.
Context engineering optimizes how operational memory is assembled before inference occurs.
This includes:
- semantic retrieval,
- structured filtering,
- temporal ranking,
- permission enforcement,
- workflow awareness,
- recency weighting,
- trust evaluation,
- and contextual summarization.
For example, an enterprise system may need to answer:
“Find unresolved customer escalations related to failed transactions during the last 30 days, prioritize high-value accounts, exclude superseded tickets, and retrieve associated sentiment history.”
This is not a pure vector query.
It is a hybrid reasoning problem combining semantics, structure, business logic, time awareness, and governance.
This is precisely where PostgreSQL’s combination of SQL, JSON, transactional guarantees, and vector search becomes extremely powerful.
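To make the hybrid nature of that request concrete, here is a minimal Python sketch of the filtering and ranking steps. It is an illustration under stated assumptions, not a PostgreSQL or pgvector implementation: the `Escalation` record, its field names, the `assemble_context` helper, and the similarity threshold are all hypothetical, and the sentiment-history lookup is left out. In a real deployment the same predicates would more plausibly live in a single SQL statement.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import sqrt


@dataclass
class Escalation:
    # Hypothetical record shape; none of these field names come from the article.
    ticket_id: int
    status: str              # "open" or "resolved"
    superseded: bool         # True if a newer ticket replaces this one
    account_value: float     # used to prioritize high-value accounts
    created_at: datetime
    embedding: list          # semantic vector for the ticket text


def cosine_similarity(a, b):
    """Plain cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def assemble_context(rows, query_embedding, now, days=30, min_similarity=0.5):
    """Combine structured, temporal, and semantic filters, then rank by value.

    The threshold and the 30-day window are illustrative defaults.
    """
    cutoff = now - timedelta(days=days)
    candidates = [
        r for r in rows
        if r.status == "open"                      # unresolved only
        and not r.superseded                       # exclude superseded tickets
        and r.created_at >= cutoff                 # temporal window
        and cosine_similarity(r.embedding, query_embedding) >= min_similarity
    ]
    # Prioritize high-value accounts first.
    return sorted(candidates, key=lambda r: r.account_value, reverse=True)
```

Only the last predicate is a vector operation. The rest are ordinary relational and temporal conditions, which is the point of the paragraph above.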
What Still Needs To Be Built
PostgreSQL alone is not yet a complete enterprise memory platform for AI agents.
The foundation exists, but higher-level orchestration capabilities still need to mature.
The industry still needs:
- intelligent context assembly frameworks,
- memory lifecycle management,
- temporal reasoning engines,
- hybrid retrieval orchestration,
- memory summarization and compression,
- relevance ranking systems,
- and governance-aware evaluation frameworks.
In particular, memory lifecycle management will become increasingly important as AI systems accumulate enormous volumes of historical state. Not all memories should remain equally accessible forever. Systems will require mechanisms for summarization, archival, prioritization, abstraction, and controlled forgetting.
Similarly, enterprises will require stronger governance models capable of explaining:
- why a decision was made,
- which context influenced the outcome,
- which tools were invoked,
- and how operational reasoning evolved over time.
These requirements align naturally with PostgreSQL’s strengths in transactional consistency, auditability, and operational reliability.
Why This Matters for Enterprise AI
Consumer AI applications can tolerate approximation and inconsistency more easily.
Enterprise AI systems often cannot.
Industries such as financial services, healthcare, insurance, telecommunications, and government operate within environments where traceability, governance, durability, and consistency are not optional requirements.
An AI system making operational decisions inside enterprise workflows must provide:
- reliable memory,
- contextual continuity,
- auditability,
- permissions enforcement,
- governance controls,
- and transactional integrity.
This is one reason why the future of enterprise AI may depend less on isolated models and more on durable operational architectures surrounding those models.
In many ways, this resembles earlier phases of enterprise computing history. Databases became foundational not simply because they stored data, but because they provided consistency, durability, governance, and operational trust.
AI systems are beginning to encounter similar requirements.
Final Thoughts
The AI industry is currently focused heavily on models, inference benchmarks, and agent frameworks. However, one of the most important long-term architectural questions may ultimately become much simpler:
How does the system remember reliably over time?
Not merely retrieve semantically similar text.
Not simply persist conversation logs.
But maintain coherent operational memory across workflows, users, systems, and decisions.
Because intelligence without durable memory eventually becomes unreliable automation.
And unreliable automation rarely survives inside enterprise systems.
The opportunity ahead may not simply be “PostgreSQL for vectors.”
It may become PostgreSQL as the durable memory and operational state substrate for enterprise AI systems.
Christophe Pettus: PostgreSQL 19 Beta: The Four Features You’ll Actually Feel
extremely low frequencies
The submarine is a surprisingly ancient technology—at least in its early, primitive forms. The idea is quite simple, that a well-enough-sealed boat ought to be able to submerge and resurface. It's the practicalities that make the whole thing difficult. It is generally considered that the US Civil War was the first use of submarines in combat; these were primitive machines with very limited operating endurance and navigational capabilities. These submarines were more like torpedoes: you pointed them in the right direction and hoped they went straight.
The First World War benefited from tremendous advances in submarine technology. A number of experimental designs during the 19th century had built practical experience, especially in Germany, and the Germans' apt use of the first modern "U-boats" had a significant military impact. British and US designs made similar advances, and submarine warfare was born.
The chief advantage of the submarine is its ability to submerge and maneuver while hidden. WW1 submarines were diesel-electric or gasoline, so their submerged endurance was limited by the power supply stored onboard. Still, these submarines could operate underwater longer than any before, long enough to establish the submarine sneak attack as a key part of naval warfare.
It was also long enough to expose one of the trickiest challenges of underwater defense: communications. Water, especially seawater, is dense and conductive. This is very bad for radio wave propagation: by the First World War it had already been discovered that seawater effectively blocked radio communications. HF radio, the main form of communications at sea (and, in the WW1 era, in general) might only penetrate seawater for a few meters in real-world conditions. That meant that submarines had to surface in order to communicate, another de facto limitation on their endurance while submerged.
The Navy had been evaluating electronic communication aboard ships since 1887, when they demonstrated a simple and "radio-adjacent" technology using conduction of waves through the seawater itself. This scheme never worked very well, but was saved by the development of modern wireless transmitters late in that century. Marconi himself demonstrated radio to the Navy in 1899, and in 1903 the Navy bought its first radio sets. Tactical reports from conflicts elsewhere on the globe, like the Russo-Japanese war, reinforced the idea that radio would serve a key role in naval combat.
When C-class submarines Stingray and Tarpon, and D-class Narwhal, launched in 1909, they were immediately given duties including the evaluation of radio equipment. In a classic tale of early technology, the evaluations went poorly. Tarpon ran into mechanical trouble that prevented its planned trial voyage, so the radio set was never installed. Stingray received a cutting-edge quenched spark gap transmitter and receiver set, but the transmitter turned out to be DOA. Still, Stingray was able to demonstrate its receivers, copying a message from the nearby Boston Navy Yard while surfaced.

Narwhal's mission was more ambitious: underwater communication. A test was made of the same direct conduction technology demonstrated in 1887, using brass plates suspended below the ships. It similarly failed to perform. A repetition of those experiments, done the next year and with improved equipment aboard Narwhal's sister ship Grayling, produced better results. The system provided reliable communications with the "antenna" plates submerged as much as two feet below the water... and no deeper. Frustrated Navy engineers concluded that it was possible to get radio signals through seawater, but not practical.
Through the First World War and following decades, engineers focused on ways to get the antenna to the surface without having to bring up the entire submarine. Around 1915, the Navy adopted a floating antenna buoy that a submarine could "winch up" towards the surface on a cable. Putting anything at the surface was less than ideal, but with the anti-submarine technology of the era, the small antenna buoy was still very difficult to detect at long range. Submarines just had to make sure it was retracted back to the submarine's deck before attempting anything where stealth was key. These floating buoys were not reliable during WW1, but they could work, and the technology has continued to develop to this day.
Still, there were other ideas about underwater communications. The most important development came from two engineers of the National Bureau of Standards (NBS), or at least, that's what a court ruled after a patent dispute between two sets of supposed inventors. John Willoughby was employed by the NBS, which would later be known as the National Institute of Standards and Technology (NIST), to investigate new types of radio receivers. In the summer of 1917, he was arranging various types of coil antennas at a receiver test site on the Chesapeake Bay when he accidentally dropped one of the antennas into the water. Strangely enough, the radio receiver connected to the antenna continued to provide good reception even as it sank into the bay.
NBS management was not especially enthusiastic about this accident, but Willoughby was. He knew that the Navy was investigating means of communication with submarines, and that seawater seemed to block radio waves, all of which suggested that he might have stumbled on an important discovery. Lacking NBS support for further research, he took the idea to gifted radio inventor and NBS colleague Percival Lowell 1. In a fine tradition of innovation, the two took to Willoughby's basement for a series of experiments that illuminated the underlying phenomenon: Willoughby had been experimenting with unusually low radio frequencies, below 30kHz where wavelengths become too long for most antenna designs and coils become the best receivers. These lower frequencies were significantly less affected by water than higher, more conventional frequencies, and Willoughby and Lowell built a successful prototype for what they called "long-wave" radio between two coils.
The NBS remained surprisingly uninterested, but Willoughby had a contact in the Navy who felt quite differently. In 1918, Willoughby and Lowell joined LtCdr H. P. LeClair, then running the Navy's experimental radio program, at submarine base New London (so named after New London, Connecticut, across the Thames River from the base). They made a hurried and rough installation of their equipment on submarine D-1 and a surface support vessel. Not everything went perfectly, but they proved the idea: Willoughby, Lowell, and LeClair listened attentively to their radio sets as the D-1 submerged and continued to come in loud and clear.
Within a matter of a few years, the Navy accepted long-wave radio as a standard technology for submarine communications. The various jury-rigged installations at New London showed that coil antennas could easily be integrated into a submarine's rigging, and even better, the Navy had found that long-wave radio propagated over the surface as well as under it. Long-wave communications would serve the entire Navy, and a transmitter site was already underway.
Long-range communications had become a top concern throughout the military in the early 20th century, and a series of meetings between US military branches and between the US and UK led to a scheme of "High Power" radio stations. The first of these, NAA, went up near Arlington, Virginia in 1913. Over the following years, similar stations were built in the US and Europe, facilitating the first direct communications between the two and the first transatlantic voice communication in 1915. The construction and operation of these stations also led to considerable advances in radio technology generally, especially powerful transmitters. NAA was one of the early stations to be equipped with Poulsen arc transmitters, almost two times more efficient than earlier designs and well-suited to long-wave operation.
Around the same time as the Willoughby/Lowell experiments, Navy engineer LtCdr Albert Taylor found similar results with long-wire antennas shallowly under the water. These experiments offered another design for concealed submarine antennas (which could be stored onboard in reels and let out with floats that kept them just under the surface), and also demonstrated that long-wire antennas could be buried for transmit use.
Five years later, in 1918, construction was underway on NSS, a new high power station in Annapolis, Maryland. Unlike those before, NSS was specifically designed for long-wave signals. Two 500 kW Poulsen arc transmitters drove an antenna 400' square, suspended between four 500' tall towers 2. The long-wave capability at Annapolis was not originally intended for submarine communications, but it quickly fell into that niche. During the 1920s, NSS became a key station for command and control of submarines.
NSS itself remained in service until 1996, and it was joined by VLF transmitters at Cutler, Maine; Jim Creek, Washington; Lualualei, Hawaii; LaMoure, North Dakota; and Aguada, Puerto Rico; besides sites in Europe operated with allied militaries. Each of these stations is its own interesting story. The 1,205' VLF antenna tower at Aguada remains the tallest structure in the Caribbean. LaMoure was originally built in the 1960s for a long-wave navigation system called Omega, and was repurposed for submarine C2. Jim Creek went into service in 1952 as the most powerful radio transmitter in the world, using a fascinating antenna that draped from one ridge to another across a mountain valley.
Let's focus, though, on Cutler. VLF Transmitter Cutler is the spiritual descendant of the Navy's original High Power program, symbolized in its inheritance of the callsign NAA. Cutler was part of a Cold War expansion of the VLF system, going into service in 1961. Many other VLF sites received upgrades around the same period, but Cutler was a completely new design. Cutler's two antennas, for redundancy, are each supported by 13 towers. The center tower is about 1,000' tall, and the other 12 make up two concentric rings of about 900' height. The complete antenna is over 6,000' across, or nearly 2 km. Between the tower tops stretches a web of tight horizontal wires, each 1" copper, that form an enormous capacitor. The capacitor's other plate is the ground, electrically reinforced by many miles of buried groundplane wires. The radiating elements are vertical wires, hanging down from the upper horizontal mesh.

In Maine's harsh winters, the wires accumulate ice until their weight threatens the towers. Each antenna is alternately switched into a deicing mode in which it is turned into a 3 MW heating element... just for long enough that the ice melts off. Outer towers are supplemented by short, stout structures that allow the 220 ton tension weights to move up and down on tracks. "Helix houses" at the feedlines of the two antennas sheltered enormous inductors; walls lined with copper served as insulation and to ground the occasional arcs that made the helix houses and transmitter rooms unsafe to enter during operation.
The two antennas were driven by a transmitter complex designed and built by Continental Electronics. The 11 MW on-site power plant supplied the AN/FRT-31 transmitter, custom to this installation, consisting of four parallel units of eight ML-6697 transmitter tubes. The transmitter's control room rivaled that of many power plants, as did its output: the military required at least 1 MW, Continental rated the transmitter for just over 2 MW, and it still operates today at powers as high as 1.8 MW. There are several reasons that the "most powerful radio station in the world" is now difficult to pin down, but NAA Cutler is certainly in the running.
That is the end of the VLF story, in that it hasn't ended. The original 1910s and 1920s VLF sites are mostly decommissioned, but only because they have been replaced by more modern equipment, sometimes on the same site. Cutler, Jim Creek, Lualualei, and Aguada are all still in service. LaMoure may be in some kind of mothballed state but is definitely capable of operating; it has recently seen some use for propagation experiments. VLF is still a key technology in the Navy's C2 and nuclear reprisal plans. So, we can say that VLF has achieved one of the great feats of technical history: it has outlived its replacement.
First, though, we should spend some more time on the theory. In modern parlance, "VLF" describes the band from 3-30 kHz. Most Naval VLF stations operate at around 24 kHz, but some stations support lower frequencies as well and other stations have operated as high as 40 kHz (still considered VLF by the Navy for practical purposes). These wavelengths pass through seawater well because of a basic trait of radio waves that was becoming experimentally apparent in the 1920s and received a thorough theoretical underpinning later. Radio waves attenuate as they pass through materials in proportion to the number of wavelengths in the material. In other words, as a rule of thumb, a radio wave with a 12 m wavelength (~24 MHz) will experience about 1,000 times the attenuation of a signal with a 12,000 m wavelength (~24 kHz). This is true of water or air or any other material, but the attenuation rate in saltwater is so high that the effect is extremely apparent in the sea.
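The arithmetic behind that rule of thumb is easy to check. The sketch below takes the paragraph's simplification at face value (attenuation scales with the number of wavelengths along the path) and converts the two example wavelengths back to frequencies. The function names are my own, and this is a back-of-the-envelope illustration, not a propagation model.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0  # free-space speed of light


def frequency_hz(wavelength_m):
    """Free-space frequency corresponding to a wavelength."""
    return SPEED_OF_LIGHT_M_S / wavelength_m


def relative_attenuation(short_wavelength_m, long_wavelength_m):
    """Rule of thumb from the text: over the same path, attenuation is
    proportional to how many wavelengths fit in it, so the ratio of
    attenuations is the inverse ratio of wavelengths."""
    return long_wavelength_m / short_wavelength_m


if __name__ == "__main__":
    print(f"12 m     -> {frequency_hz(12.0) / 1e6:.2f} MHz")
    print(f"12,000 m -> {frequency_hz(12_000.0) / 1e3:.2f} kHz")
    print(f"attenuation ratio: {relative_attenuation(12.0, 12_000.0):.0f}x")
```

Both wavelengths work out to just under 25 MHz and 25 kHz, which is where the "~24" figures come from.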
This brings us to our first property of VLF: because of the long wavelength of VLF signals, they pass through water with relatively little attenuation. Still, there is a limit. The details of submarine communications are mostly classified, but from open materials it is realistic for a submarine to receive a VLF transmission up to about 100' below the surface. This depth is already far better than what's achievable with HF, and far superior to deploying a floating buoy. Still, intuition dictates that even lower frequencies could be even better, and the Navy did not fail to notice that possibility.
Second, we should revisit the antennas. One of the key insights of early experimenters like Willoughby and Lowell is that coil antennas create an asymmetry in radio communications. Antennas become more efficient as they reach the wavelength of the signal, or multiples thereof. That means that lower frequencies, and longer wavelengths, require larger antennas—thus the 6,000' wide cobwebs at Cutler and more than one regional height record set by VLF antenna towers. On the other hand, coil antennas, or more specifically magnetic loop antennas, can be very small compared to the wavelength they receive.
Unfortunately, the physics trick that makes magnetic loop antennas work so well (magnetic coupling) is basically one-way. Magnetic loop antennas are relatively inefficient but usable for reception; they're completely useless for transmitting. VLF is effectively a one-way technology, and some of the traffic carried by the Navy's VLF network consists simply of orders for submarines to surface or deploy a buoy for more advanced communications.
Finally, we should observe that the capacity of a radio channel to carry information is proportional to its bandwidth, and that the use of lower frequencies and longer wavelengths makes the usable bandwidth of given radio equipment much smaller (we can intuitively understand this by noting that larger antennas are, simply due to scaling, more precisely tuned to their intended wavelength than smaller antennas). VLF transmitters are only capable of very narrow transmissions, functionally limiting them to continuous wave (Morse code) operation or simple digital schemes at very low speeds.
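For a rough sense of scale, the proportionality between capacity and bandwidth can be sketched with the Shannon-Hartley formula, C = B log2(1 + SNR). That formula is a standard result that the paragraph above does not name, and the bandwidth and signal-to-noise figures below are purely illustrative assumptions, not measurements of any Navy system.

```python
from math import log2


def shannon_capacity_bps(bandwidth_hz, snr_linear):
    """Upper bound on error-free bit rate for a channel of the given
    bandwidth and linear (not dB) signal-to-noise ratio."""
    return bandwidth_hz * log2(1.0 + snr_linear)


if __name__ == "__main__":
    snr = 3.0  # assumed, roughly 4.8 dB
    # Hypothetical channel widths chosen only to show the scaling.
    for label, bandwidth in [("narrow long-wave channel", 200.0),
                             ("voice-width HF channel", 3000.0)]:
        print(f"{label}: {shannon_capacity_bps(bandwidth, snr):.0f} bit/s")
```

At a fixed signal-to-noise ratio the capacity scales linearly with bandwidth, so a channel fifteen times narrower carries at most one fifteenth of the data.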
We probably all realize, as did the Navy, that pushing to yet lower frequencies and longer wavelengths would produce better penetration of the seawater, at the cost of basically every other property becoming worse: larger antennas, less efficient transmitters and receivers, narrower bandwidths. The possibility of going even further, from Very Low Frequency to Extremely Low Frequency, was just a solution in wait of a problem. The military had a lot of those, and the Cold War was one huge problem.

The idea of a nuclear-powered submarine is almost as old as the nuclear program, and a collaboration between the Navy, the Atomic Energy Commission, and famed admiral Hyman Rickover led to the 1954 launch of nuclear-powered submarine Nautilus. The next decade gave the Electric Boat Company new meaning, as nuclear propulsion displaced diesel in the US submarine fleet and fundamentally changed the strategy of submarine warfare. Nuclear submarines, unlike those using diesel-electric or gasoline propulsion, can be set up to remain submerged almost indefinitely. The reactor does not require air, and provides plentiful power for life support equipment that mitigates the fresh air requirement for everything else. This created a generational change: by some definitions, all pre-nuclear submarines were merely submersibles, ships designed to submerge only temporarily. The nuclear submarine was a new kind of creature, one that not only visited the depths but could live there.
Add in the development of submarine-launched ballistic missiles (SLBMs), which enabled a submarine to direct nuclear weapons at targets on shore with shorter travel time than any other means of delivery. Every submarine became a portable missile silo, one that could not only hide but actively evade detection. Their ideal mission was to lurk, undetected, for extended periods of time.
Of course, this new potential for submarines further stressed communications infrastructure. A nuclear submarine might spend weeks submerged in water that is ostensibly controlled by another nation, making stealth critical. Such a submarine doesn't want to remain close to the surface, which makes detection by all means easier, and also doesn't want to deploy floating buoys or antennas that are easily detected by modern radar. On the other hand, for it to have any value as a nuclear deterrent, the Navy needs some way to deliver a launch order without having to wait for the next duty rotation.
The military spent the early Cold War developing a dozen different systems for survivable delivery of nuclear war orders, things like the High Frequency Global Communications System (HFGCS) and TACAMO that solidified the concept of short, simple, one-way Emergency Action Messages to direct nuclear forces. The Navy needed a way to deliver EAMs to submerged submarines, and that provided the impetus to investigate lower frequencies than ever before.
The lowest generally recognized radio band, ITU band 1, is Extremely Low Frequency or ELF. There is some historic complexity around the definition of ELF, and the modern range of 3-30 Hz does not exactly match the way the Navy has used the term. In general, though, we can consider ELF to refer to the very bottom end of the usable radio spectrum. The extreme lower edge could be said to fall around 7 Hz, where the wavelength of a radio signal matches the circumference of the earth. This leads not only to complex interference problems due to constructive and destructive interactions, it also produces a very high noise floor as global lightning storms trigger perturbances that resonate on and on. Balancing the desire for the lowest possible frequency against the practical challenges of ELF, the Navy settled on the range of 72-80 Hz as the most promising window for submerged submarines.
The history of Naval ELF development is not simple to research. First, the Navy conducted much of its ELF research in secrecy, a result of typical Cold War paranoia and an awareness that the Soviet Union was pursuing a similar idea. Second, much like GWEN, ELF became the locus of fervent public opposition grounded in general anti-war sentiment, demands for nuclear disarmament, and the safety of electromagnetic radiation. Many of the readily available sources on ELF history today come from "electrosensitive" advocates or newsletters, a still-strong movement founded on the mostly unscientific premise that EM fields pose a danger to human health. While mostly factually accurate, these sources require some caution since they tend to mix their historical narrative with observations about EM and RF safety that are now broadly considered pseudoscientific. Still, this frustration leads to two positive outcomes: first, it helps to place the development of ELF radio within a broader cultural context of uncertainty about both war and new technology, emphasizes the unknowns involved in the push to ELF, and makes the ELF stations an interesting focus of the anti-war movement. Second, it leads to a personal connection that likely contributed a great deal to my interest in military communications.
There are rumors, even scant evidence, that the Navy initiated classified experiments with ELF in the late 1950s. There is very little that I can say about this first part of ELF history, besides that the experiments must have had promising results. In 1968, the Navy adopted a full-scale ELF communications plan called Project Sanguine.
The original Sanguine proposal was truly an artifact of the Cold War, remarkable in its scale and doomed to obsolescence before construction even began. The Sanguine ELF station would actually be over one hundred independent transmitting stations, operating in synchronization as a form of hardening. The loss of a subset of those stations, say due to nuclear attack, would only reduce power rather than disabling the entire facility. Of course, to maximize survivability of the individual transmitters, they would all be installed in hardened underground bunkers, each with a set of 2" antenna cables extending 40 or more miles in four directions. The overall layout of stations and antennas created a grid with antenna elements spaced every 3-5 miles, covering a total of some 6,500 square miles. That's larger than Connecticut, but smaller than New Jersey. Perhaps more apropos, it is about 1/10 the area of Wisconsin, the state where the Navy planned to install the system.[3]
This underscores a fundamental problem with ELF: antenna sizes. At 80 Hz, the wavelength of a radio wave is 2,300 miles, or about one quarter of the diameter of the earth. Take, for example, a half-wave dipole antenna—a very common antenna design in most bands. For ELF, the antenna would need to stretch from Albuquerque to Portland. Clearly, then, any practical ELF antenna needs to be "electrically short" or, in the relative sense of RF engineering, a small antenna. Small antennas are inefficient, and the smaller they get the less efficient they are. Complicating things further, practical ELF propagation over the surface of the earth requires vertically polarized waves. That means a vertically polarized antenna, and there is simply no way to construct a tower that is hundreds of miles tall.
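The antenna arithmetic above can be checked with a few lines of Python. This is a rough sketch assuming free-space propagation at the speed of light and an approximate mean Earth diameter of 7,918 miles (an assumed round figure, not from the article).

```python
C_M_PER_S = 299_792_458.0       # speed of light in vacuum
METERS_PER_MILE = 1609.344      # international mile
EARTH_DIAMETER_MILES = 7918.0   # approximate mean diameter (assumed value)


def wavelength_miles(freq_hz: float) -> float:
    """Free-space wavelength in miles for a given frequency in hertz."""
    return C_M_PER_S / freq_hz / METERS_PER_MILE


wavelength = wavelength_miles(80.0)
half_wave_dipole = wavelength / 2.0
fraction_of_diameter = wavelength / EARTH_DIAMETER_MILES

print(f"wavelength at 80 Hz: {wavelength:,.0f} miles")
print(f"half-wave dipole:    {half_wave_dipole:,.0f} miles")
print(f"fraction of Earth's diameter: {fraction_of_diameter:.2f}")
```

The half-wave figure comes out a little under 1,200 miles, which is the scale of the Albuquerque-to-Portland comparison.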
Sanguine proposed, and most later ELF projects adopted, a style of antenna called a ground dipole. A ground dipole is basically two different electrodes, or grounding rods, driven into the ground a great distance apart and connected by feedlines. The power from the transmitter goes through the electrodes into the ground, where it flows as ground current from one end of the antenna to the other. The ground dipole thus forms a loop, with the feedlines as one side and the ground as the other. The actual RF emission results from the magnetic field between the feedlines above ground and the current flowing beneath, somewhat like the VLF antenna at Annapolis if half of it was buried beneath the ground.
Ground dipoles, like a typical dipole antenna, are directional. They emit RF most strongly in the same axis as the antenna, with strong lobes extending away from the ends of the two feedlines. By installing a second antenna on a perpendicular axis and shifting the phase between the two, you can create a steerable antenna with its strongest lobes pointed in the direction of your choice. That's why the Sanguine proposal, and most ELF transmitters after, have used two ground dipoles in a crosswise layout.
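To get a feel for how a crossed pair can be steered, here is a deliberately idealized sketch. It assumes each dipole's field falls off as the cosine of the angle from its own axis, which matches the "strongest along the axis" description above but ignores everything else about real ground dipoles. The function names and drive parameters are hypothetical, and the Navy's actual steering scheme is not described in the sources.

```python
import cmath
import math


def relative_power(azimuth_deg, amp_a=1.0, amp_b=1.0, phase_b_deg=0.0):
    """Relative radiated power at an azimuth for two perpendicular dipoles.

    Dipole A lies along 0 degrees and dipole B along 90 degrees. Each is
    assumed to have a cosine field pattern about its own axis.
    """
    theta = math.radians(azimuth_deg)
    phase_b = cmath.exp(1j * math.radians(phase_b_deg))
    field = amp_a * math.cos(theta) + amp_b * phase_b * math.sin(theta)
    return abs(field) ** 2


def peak_azimuth_deg(**drive):
    """Azimuth of the strongest lobe, searched over 0-179 degrees.

    The pattern repeats every 180 degrees, so the opposite lobe is implied.
    """
    return max(range(180), key=lambda az: relative_power(az, **drive))


# Equal drive, in phase: the lobe sits on the diagonal between the two axes.
print(peak_azimuth_deg(phase_b_deg=0.0))
# Equal drive, dipole B inverted: the lobe flips to the other diagonal.
print(peak_azimuth_deg(phase_b_deg=180.0))
# Unequal drive levels move the lobe to an arbitrary bearing (30 degrees here).
print(peak_azimuth_deg(amp_a=math.cos(math.radians(30)),
                       amp_b=math.sin(math.radians(30))))
```

In this toy model a 90 degree phase shift spreads power evenly in all directions, while in-phase or inverted drive with adjusted amplitudes points the lobes at a chosen bearing.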
During the 1960s, the Navy performed a series of poorly documented experiments to establish the feasibility of Sanguine. These included a Wyoming power transmission line that was temporarily disconnected for use as an ad-hoc 40 mile antenna, and a power-line-like 110 mile antenna built by RCA in North Carolina and Virginia. The details of this RCA experiment, part of Project Pangloss, have become obscure. It appears that RCA was contracted to evaluate a number of different communications options for the Navy, including the use of other planets in the solar system as passive repeaters, but most of them didn't work out. The VLF transmitter for the project was located at Ararat, North Carolina, and the two electrodes at Algoma, Virginia and Lake Lookout, North Carolina. A 1963 test successfully got a message from the test antenna to a submarine submerged 150' deep and 520 miles from the transmitter.
Like most of the military's ambitious plans in the late 1960s, Project Sanguine didn't happen. The reasons are complex, or at least several. Sanguine was unpopular with the public: besides specific concerns around safety, the late '60s saw a rising anti-nuclear campaign and a general lack of interest in enormously expensive military undertakings. The fact that Sanguine needed a massive amount of land meant that it was pretty much impossible to site it somewhere that wouldn't generate local opposition, so like ICBM fields, Sanguine was kicked around like a football. Originally planned for Wisconsin, it later shifted to Texas, and Texas didn't like it that much either (although by that point the antenna field had been downsized to just 1,600 to 3,200 square miles). And, of course, the technology was struggling to keep up with the threat landscape. The hardened design of Sanguine relied mostly on the idea that the Soviet Union couldn't possibly nuke most of the transmitters distributed over 6,500 square miles, a reassurance that the development of multiple independently targetable reentry vehicles (MIRVs) seriously undermined.
As public opposition formed, a health and safety review commissioned by the Navy resulted in a noncommittal report that did little to reassure the public (and lawmakers) that the plan was safe. Last of all, but certainly not least, the budget projections for Sanguine were formidable, and Congress did not have the appetite for the spending.
Sanguine made it far enough that, during 1968, the Navy and RCA built a scaled down transmitter and antenna in the Chequamegon National Forest of Wisconsin. This came to be known as the Wisconsin Test Facility, and it was used as a transmitter for a series of jamming tests in the late '60s and early '70s. During this period, the Navy also considered the use of a BPA transmission line from The Dalles, Oregon to Los Angeles as an ELF transmitter—the plan being to actually modulate messages onto the 60 Hz AC power carried by the line, which was incidentally radiated due to the line's largely straight 850 mile span. This plan was called PISCES, and it is unclear if it ever went anywhere, although an interesting rumor holds that it was operational for a short period and used as the "jammer" transmitter for jamming susceptibility testing of the Wisconsin transmitter.
The results of these tests were mostly positive, but that wasn't enough to save an unpopular plan. Sanguine faded away, perhaps replaced by a scaled-down system called Super Hard ELF or SHELF. There is very little information on SHELF today. The idea seems to have been to install an ELF antenna in deep underground shafts (potentially over a mile below the surface) using hard-rock mining techniques. Work on SHELF apparently continued through the 1970s, but it probably never got beyond the feasibility stage.
Instead, the Navy shifted its focus to Project Seafarer. Seafarer was clearly a direct descendant of Sanguine, but addressed many of its biggest problems through a stripped down design. Seafarer transmitters, for example, would be located in surface buildings instead of underground. Still, the same basic antenna design remained, a grid on 3.5 mile spacing requiring about 4,700 square miles. The Nevada Test Site was considered as a location, as was White Sands Missile Range and forestland in the Upper Peninsula of Michigan. Michigan was ultimately selected, a result of favorable ground conditions and the lack of frequent large explosions. Seafarer construction was expected to begin in 1977, but instead it ended. The governor of Michigan shot the idea down, Congress didn't like it all that much, and President Carter signed the order ending work on not only Seafarer but ELF in general. In 1977, after roughly two decades of R&D work across multiple experimental sites, the ELF Program was in mothballs.
The Navy was not so easily dissuaded. Later in 1977, they proposed "Austere ELF," a plan to throw together an ELF transmit site more or less from spare parts. A transmitter at Sawyer AFB in Michigan's Upper Peninsula would feed 32, 45, and 53-mile-long antenna elements, and via a leased telephone line the AFB would also control the inactive Wisconsin Test Facility transmitter. Even this basic, partially spare-parts plan fell afoul of the public and Congress. It failed to address most of the original health and environmental concerns, and still cost too much.
Serious resumption of the ELF program would have to wait for President Ronald Reagan. Reagan was a fan of big, expensive, technically sophisticated solutions to Cold War problems, and ELF sure was one of those. Reagan approved "Project ELF," itself a scaled down version of Austere ELF. Project ELF used the existing Wisconsin Test Facility, supplemented by an identical 56-mile antenna in Michigan's Escanaba State Forest. Both would be operated by Sawyer AFB.
The Wisconsin Test Facility from Project Sanguine, after 20 years, came to be known as Navy Radio Transmitter Clam Lake: the first operational ELF transmitter. The Michigan site, known as Navy Radio Transmitter Republic, quickly joined it.
It's amusing that a temporary test facility ultimately became the final product, but the Navy had already invested a huge amount of effort in the Wisconsin transmitter. Everything from the strength of the EM field produced by the transmitter to its location in a National Forest had posed complications.
Although Sanguine was intended as a hardened, underground system, burying antennas was a lot of work and the Wisconsin Test Facility had originally been temporary. Instead of buried cables, it used 1/2" aluminum wires strung above ground on utility poles for the two antennas. The voltages on the antenna wires required isolation from the surrounding environment, so as with power lines, trees were cleared to make a right of way for the antenna cables. The Forest Service, concerned about aesthetic impact on the forest's recreational areas, required that the antenna routes avoid some parts of the forest and take right-angle jogs near roads so that it was not possible to see a considerable distance down the antenna ROW when driving past (which would make the existence of the cleared ROW much more obvious). The transmitter site and antenna ROWs are still clearly visible today. At each of the four ends, about seven miles from the transmitter building, around 10,000 feet of buried copper wire make up the electrode.
Trickier were the electrical problems. The ELF antennas could induce a significant potential in parallel electrical lines, and the use of ground return meant a lot of interference on telephone lines. When transmitting, which was ultimately the case 24/7, the 2.6 MW transmitter induced a current of about 300 A in the cables and ground. Understanding these impacts of ELF transmitters was actually one of the original purposes of the Wisconsin Test Facility, and the Navy had built model power and telephone lines parallel to the antenna elements. The ELF system was found to cause problems ranging from flickering light bulbs to phantom telephone ringing, and the Navy installed additional grounding and filtering on public utilities throughout the area at its own expense—even reimbursing the utilities for administrative costs related to customer complaints. Still, the interference problems were not fully solved during the test operations and no doubt contributed to the public's less than enthusiastic support.
The former Wisconsin Test Facility, as Clam Lake, became operational in 1985. Its sister site, Republic in Michigan, went online in 1989. Republic was new construction, not an old experimental facility, but for cost and expediency reasons it was a virtually identical design to Clam Lake with above ground wires to buried electrode screens. Because of geographical constraints, the Republic antenna is not in a straightforward cross configuration. Instead, it's more of an "F" shape, electrically equivalent but with the feedlines placed differently. From 1989 on, the two sites operated in synchronization, with their total 2.6 MW operational transmitter power producing a radiated power of about eight watts.
Yes, even at 14 miles in length, ELF ground dipoles are extremely inefficient. This remained a key problem with ELF. Early Navy ELF plans, like Project Sanguine, had assumed the use of extremely high transmit powers to produce a usable signal. ELF propagates very well, but at the paltry 8 W achieved by the Project ELF transmitters, practical reception still required extracting the transmitted signal from a noise floor that was just about as loud. That meant reducing the practical bandwidth of the system even further, and thus its speed. Project Sanguine would probably have been able to transmit EAMs directly to submarines; Project ELF was not. Even the compact format of EAMs was too long for a system with an effective symbol rate of about one letter per five minutes, or fifteen minutes to transmit the three-letter code groups used by the Navy.
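The numbers quoted here are worth working through. The sketch below takes the 2.6 MW input power, 8 W radiated power, and five-minutes-per-letter rate from the text; the 20-letter message length is a hypothetical stand-in, since the actual length of an EAM is not given.

```python
import math

TRANSMITTER_POWER_W = 2.6e6   # total operational transmitter power (from the text)
RADIATED_POWER_W = 8.0        # approximate radiated power (from the text)
MINUTES_PER_LETTER = 5.0      # effective signaling rate (from the text)

# Fraction of input power that actually leaves as radio waves.
efficiency = RADIATED_POWER_W / TRANSMITTER_POWER_W
# The same ratio expressed as a loss in decibels.
loss_db = -10.0 * math.log10(efficiency)


def minutes_to_send(num_letters: int) -> float:
    """Time to transmit a message of the given length at the quoted rate."""
    return num_letters * MINUTES_PER_LETTER


print(f"radiation efficiency: {efficiency:.2e} ({loss_db:.0f} dB loss)")
print(f"three-letter code group: {minutes_to_send(3):.0f} minutes")

# Hypothetical 20-letter message, purely to show how quickly the time grows.
print(f"hypothetical 20-letter message: {minutes_to_send(20):.0f} minutes")
```

Only a few millionths of the input power is radiated, and anything longer than a short code group would tie up the channel for hours, which is consistent with the system being used for alerting rather than message delivery.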
This reduced ELF capability was basically a very fancy pager network. The Navy has not disclosed the details of the scheme, but it's probably something like this: each submarine has a three-letter code group assigned to it. When its ELF receiver detects that specific code group, the submarine crew know that there is a message waiting for them, and they have to move at least close enough to the surface for VLF in order to find out what that message is. The Navy often referred to this as "bell ringing:" ELF messages were like the ringing of a telephone. As a means of supervision, so that submarines knew they were capable of receiving a message, "idle" code groups were transmitted 24/7.
For how hard the Navy had fought to build it, Project ELF did not have a long life. The Navy's ELF submarine communications system was conceived around 1958, became operational over 30 years later in 1989, and shut down in 2004 after just 15 years of service. "The Nuclear Register," an anti-nuclear-weapons newsletter, put it like this: "A surprise Navy announcement signaled the end of 36 years of first local, then global, opposition to the Navy's giant transmitter system."
ELF overcame formidable political odds. Besides Congress's lack of interest in the expense and federal policy concerns around health and the environment, a statewide ballot referendum in Michigan had attempted to prohibit construction and legislation prohibiting ELF transmitters was perennially introduced in the federal congress. Activist groups opposed to the transmitters staged regular demonstrations and, as Project ELF proceeded despite their objections, protests gave way to civil disobedience. Utility poles supporting the ELF cables were cut on numerous occasions, and the transmitter buildings vandalized. "The Nuclear Register" wrote:
Nukewatch said the Navy's closure announcement, while welcome, raises more questions than it answers. The Navy said "improved technologies" and "changing requirements of today's Navy" made ELF obsolete. However, "very-low-frequency (ELF) [sic] alternatives to ELF have been around for 30 years and the 'changing requirements' refer to the end of the cold war that happened 14 years ago," LaForge said.
Indeed, it is hard for me to see the undignified closure of the Navy's ELF program as anything other than an admission of failure. The basic technical concept of ELF appears sound, but the transmitters are large, disruptive, and costly to operate. It is not clear that the advantages of ELF, namely the greater depth at which it can be received, outweigh its downsides or compare well to VLF.
VLF is still used by the US Navy today. ELF is not: the US has had no ELF capability since the 2004 closure of Clam Lake and Republic. China, India, and Russia are the only other nations to have constructed ELF transmitters. The Russian system, ZEVS, operates at 82 Hz from a ground dipole antenna in place since at least the early 1990s. It is a candidate for the most powerful radio transmitter in the world, although the exact specifications have not been made public. India's INS Kattabomman gained an ELF transmitter in the 2010s, and while few details are known, China is believed to have constructed an enormous ELF transmitter in Huazhong during the 2010s.
It is, of course, interesting that China and India have both built an ELF capability after the US abandoned the technology. One wonders what made an ELF capability so hard to sustain here, even after the Clam Lake and Republic sites were built. Well, there is an inertia to politics: the organized opposition to ELF, once energized, didn't go away. Area residents and politicians continued to organize for the closure of the Wisconsin and Michigan transmitters until their final days.
Opponents of the ELF sites got plenty of help from both science and popular culture. Preliminary research linking ELF radiation to leukemia has not held up to modern scrutiny, but as with broader EM/RF cancer links this is an area of ongoing controversy. Extensive research by the Navy, mostly on the Clam Lake Site, hasn't found evidence of ecological disruption due to the ELF transmitter. Still, there is ongoing controversy, and one of the reasons for Project ELF's long and tortuous construction process was a series of lawsuits and appeals under the National Environmental Policy Act, contesting the thoroughness of the environmental research.
As usual, these possible connections to health and environmental impacts have given way to conspiracy theories. In the more shadowy corners of the internet, ELF is associated with everything from strange sensations to mind control. And that is where I first became involved.
The X-Files episode "Drive" (S06E02) sees Fox Mulder cornered, practically carjacked, by a man who insists that if he does not drive West then his head will explode. The episode aired four years after the release of Speed and no doubt owes inspiration to that film (Mulder even makes a joke about it in the episode), but it attributes the bizarre scenario to a very different cause. The hapless victim, portrayed by Bryan Cranston, gained his head-exploding illness as a result of some sort of military experiment involving long antennas secretly buried beneath his house. Vince Gilligan wrote the episode, and while there were several influences, the final episode is a direct reference to Project ELF and the surrounding controversy. Years later, because of their collaboration on "Drive," Vince Gilligan cast Cranston as the lead in his show Breaking Bad.
In the episode, Cranston doesn't make it to the West Coast. Mulder and Scully hatch a plan to puncture his inner ear somewhere on the California coast and relieve the pressure building in his brain, but Mulder just can't drive fast enough. Cranston's head explodes.

Over the lifespan of the Project ELF facilities, police issued 636 trespass citations to demonstrators. Congressional representatives introduced legislation and amendments to end the ELF program multiple times. At least a half dozen ELF transmitter concepts were canceled, each one less ambitious than the ones before it. ELF is an interesting technology, but in a way, it's more interesting as a case study in military acquisition.
Take a concept that is expensive, politically unpopular, and questionably superior to systems already in service—but if the military wants it, they tend to eventually get it. After thirty years, the military wears resistance down and gets something pushed through. Fifteen years later, the Navy shrugs, calls it obsolete, and shuts it down. What's left is a 14-mile-across "X" in the forests of Wisconsin, a legacy of controversy that still echoes, and a pretty good episode of The X-Files.
1. Unrelated to astronomer Percival Lowell, although there are enough moments of intersection between the two that you wonder if they might have met.
2. Many of the fine details of the original NSS installation have become confused, probably because the Navy upgraded the equipment several times in its first decades and the specifications of different eras have been mixed together. Here are some notes: some sources give the transmitters as 350 kW, others as 500 kW. A Navy history explains that improvements to the antenna design allowed for raising the power after the site was originally designed, so 500 kW is indeed what was installed but we know where the 350 number came from. Some sources give the original towers as 500' tall and others (including Wikipedia) 600'; I think the 500' number is more reliable as it agrees with the Navy history. I am not quite sure where the confusion comes from, though.
3. Some sources, such as Wikipedia, give a number of 22,500 square miles and 2/5 the area of Wisconsin. This was the very top end of a preliminary estimate that was revised down to 6,500 during planning. The 22,500 number still frequently appears, probably just because it's the more absurd figure, which is an example of the challenges of historical research when most information comes from activist groups opposed to the thing you're researching. Of course, we have to temper that criticism with the fact that some anti-Sanguine sources use the 6,500 figure, especially older ones. The shift towards the more attention-grabbing 22,500 might have happened later as Sanguine was discussed more by people without original knowledge of the program.
Christophe Pettus: What a Data Lake Actually Is (and why you probably don’t need one)
www @ Savannah: Malware in Proprietary Software - Latest Additions
The initial injustice of proprietary software often leads to further injustices: malicious functionalities.
The introduction of unjust techniques in nonfree software, such as back doors, DRM, tethering, and others, has become ever more frequent. Nowadays, it is standard practice.
We at the GNU Project show examples of malware that has been introduced in a wide variety of products and dis-services people use everyday, and of companies that make use of these techniques.
Here are our latest additions
April 2026
- Amazon is disconnecting the early models of the Swindle from the Amazon DRM-afflicted book store.
- Some models of Vizio “smart” TVs will have some of their functionalities locked behind a Walmart account login.
Eden: NHS goes to war against open source
Terence Eden reports that the UK's National Health Service (NHS) is preparing to close almost all of its open-source repositories as a response to LLM tools, such as Anthropic's Mythos, becoming more sophisticated at finding security vulnerabilities. He does not, to put it mildly, agree with the decision:
The majority of code repos published by the NHS are not meaningfully affected by any advance in security scanning. They're mostly data sets, internal tools, guidance, research tools, front-end design and the like. There is nothing in them which could realistically lead to a security incident.
When I was working at NHSX during the pandemic, we were so confident of the safety and necessity of open source, we made sure the Covid Contact Tracing app was open sourced the minute it was available to the public. That was a nationally mandated app, installed on millions of phones, subject to intense scrutiny from hostile powers - and yet, despite publishing the code, architecture and documentation, the open source code caused zero security incidents.
Furthermore, this new guidance is in direct contradiction to the UK's Tech Code of Practice point 3 "Be open and use open source" which insists on code being open.
health @ Savannah: GNU Health featured at the Cyber|Show UK
GNU Health at the Cyber|Show!
Grab a coffee and listen to the 40-minute interview Andy Farnell and Helen Plews conducted with Luis Falcón on their wonderful show. ❤️
They covered key aspects of citizen and patient data privacy, hospital management, federated health networks, genomics and wearables. In the interview they also talked about the risks associated with commercial, closed-source electronic health record systems and proprietary mobile applications.
The interview reveals how crucial Free/Libre software is for equity and digital sovereignty in our societies. 🩺 🏥 🧬 👇️
https://cybershow ... pisodes.php?id=64
About Cyber|Show:
https://cybers ... w.uk/about.php
Get this and latest news about GNU Health from our official Mastodon account:
https://mastodon. ... social/@gnuhealth
Tags: #GNUHealth #GNU #OpenScience #PublicHealth #Privacy #FreeSoftware #SocialMedicine #CyberShow
Bruce Momjian: New Presentation
I just gave a new presentation at PGDay Armenia titled Building an MCP Server Using Postgres. The talk is a follow-up to my Databases in the AI Trenches talk, and explores how MCP allows functionality beyond LLMs and RAG alone. It includes MCP demos of a radiation detector and pretzel bakery.
Victory after a decade preventing Radio Lockdown
The European Commission is choosing to protect users’ right to install any software on their radio devices by deciding to abandon the specific article in the EU Radio Equipment regulation that was harming software freedom.

From 2014 onwards, a specific article of the EU Radio Equipment Directive (RED) threatened to make it impossible to install custom software on most radio devices like WiFi routers, mobile phones, Bluetooth chips in computers, GPS receivers, and embedded devices. It would have required hardware manufacturers to prevent users from installing any software not certified by them.
After more than 10 years of persistent steady work by the FSFE and a broad coalition of organisations, the European Commission decided in January 2026 to abandon this provision: Free Software on radio devices remains protected!
This decision followed an impact assessment study commissioned by the EC’s DG GROW (Directorate-General for Internal Market, Industry, Entrepreneurship and SMEs), published in December 2025. The study evaluated five policy options and concluded that the risks associated with software reconfiguration of radio devices "remain theoretical and have not materialised in a systemic manner". It recommended a soft law approach based on voluntary guidance and best practices, rather than binding technical restrictions. Activating Article 3(3)(i) was found to severely harm Free Software, innovation, and user rights, while imposing prohibitive costs on small and medium-sized enterprises.
Notably, the impact assessment cited the legal study by Dr. Till Jaeger, commissioned by the FSFE, which demonstrated that Article 3(3)(i) is incompatible with widely used Free Software licences such as the GNU GPL. The FSFE and the concerns raised by the Free Software community were explicitly referenced as reasons against activation.
This outcome is the result of more than a decade of sustained work, with intense phases but also phases of waiting for the right moment to get active again. Since 2015, the FSFE has been monitoring the regulatory process, contributing expertise to consultations, publishing analyses, and working with a broad coalition of organisations and individuals who raised their voices against Radio Lockdown. It demonstrates that persistent, evidence-based engagement with EU policy processes can make a real difference for software freedom.
This success would not have been possible without the many people and organisations who took action over the years. Thank you to everyone who contacted the European Commission and political representatives, who raised awareness about Radio Lockdown, who participated in public consultations, who signed the Joint Statement against Radio Lockdown, and all the FSFE supporters for their financial contributions enabling our work. Your engagement made a real difference.
However, while the immediate threat of Article 3(3)(i) has been averted, the underlying idea of shifting compliance responsibility to manufacturers, and thereby restricting which software can run on devices, may resurface in other regulatory contexts.
Ensure that software freedom remains protected:
- Stay informed about EU policy developments affecting Free Software via the FSFE's news channels and newsletter.
- Support the FSFE's ongoing work for Device Neutrality, which safeguards users' rights to choose the software running on their devices.
- Help us continue our work by becoming a supporter of the FSFE. Sustained engagement with EU policy processes requires independent resources.
It often takes a long breath, patience, the expertise to spot the right time for action, and the resources to then actually act. With your help the FSFE will continue to defend the right of users to install or remove any software on any of their devices.
Become a supporter today!
Jan Wieremjewicz: Open source doesn’t die. It gets unfunded.
If you use PostgreSQL in any capacity, this week very likely started with a bang for you. pgBackRest, one of the best-known tools for PostgreSQL, praised for its scalable and reliable approach to backups, has announced that the project is currently archived.
Archived, you mean EOL?

No! Open source software rarely has a hard “end of life.” What it does have are maintainership gaps, and those can be just as serious.
Christophe Pettus: AIO Grows Up
Dave Stokes: PostgreSQL, Timezones, and DBeaver
Time zones are an unfortunately complex subject when dealing with PostgreSQL. You may be running your local time zone on your on-premises server or on your own laptop. Or you may be using the time zone of your server’s physical location. Or you may have set all your servers to UTC. All are valid approaches, depending on your circumstances.
DBeaver users know it is a very advanced tool for database work. But it is easy to get into time zone issues, as the default time zone for your session is taken from your client machine. But this can be adjusted.
UTC?
UTC, or Coordinated Universal Time, is the time standard used as the basis for all time zones worldwide. It is a constant time scale and does not change for Daylight Saving Time (DST). The benefits include streamlined cross-region data synchronization, easier debugging, accurate time-based transaction ordering, and scalability for global applications. In distributed systems, storing data in UTC ensures that logs and transactions are ordered correctly, regardless of which geographic region recorded the action. There are many more reasons to use UTC, which will be ignored for brevity.
PostgreSQL knows how to adjust for situations such as when you are in a North American time zone and the cluster of servers is in EMEA. Most client programs, when sending a timestamp, have it converted by the PostgreSQL server. But sometimes that does not happen, and you need to make adjustments.
Checking Your Time Zone
You can check your time zone in PostgreSQL with the SHOW timezone; command. In my case, I am in Texas, which is in the America/Chicago time zone.
[Figure: The SHOW timezone; command and its output]
You can see the time zone offset at the end of the output from select now();
[Figure: Here we see the date, the time, and the offset from the current time and UTC.]
In this case, on my laptop’s local instance, I am using the local time zone, which is currently 5 hours behind UTC.
This is the session timezone, set by the client's connection information. I can double-check that by querying SELECT * FROM pg_settings WHERE name = 'TimeZone';
[Figure: Here we see the connections, the time zone, and that the server is getting the time zone information from the connection.]
The Problem
What if your server is running in a different time zone, such as UTC? We will assume the server is performing a large number of transactions and is replicated to other servers in other time zones. But when you connect to that server, your session will report the time zone from your system, which is probably not UTC. You want UTC!
What if you have a situation where you would want to use UTC everywhere?
Why is UTC better in your scenario?
Consistency across replicas and time zones: All servers (primary and replicas) see the exact same absolute timestamps for events, regardless of their local OS time zone or continent. This avoids subtle bugs during replication, failover, or cross-server queries.
High-transaction reliability: timestamptz (timestamp with time zone) internally stores everything as UTC anyway. Using UTC as the session/server default eliminates ambiguity when inserting or reading times without explicit offsets.
Avoids DST and timezone rule changes: Local time zones can shift due to daylight saving time or political changes. UTC never does, making historical data and comparisons reliable.
Replicated environments: Different replicas in other time zones could interpret or display the same data differently if they rely on local server time. UTC removes that risk.
Key rule: Use the timestamptz (or timestamp with time zone) data type for all timestamp columns that represent moments in time (e.g., created_at, transaction_time, audit logs). Avoid plain timestamp (without time zone) unless you truly need a "local wall-clock time" without timezone context (rare for transactional systems).
Store in UTC, convert to user-local time only in the application/presentation layer (e.g., your app code, reports, or UI). This is the consensus best practice for distributed/global systems.
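The "store in UTC, convert at the presentation layer" rule can be sketched in a few lines of Python. This is a minimal illustration, not PostgreSQL behavior itself: the fixed offsets `CDT` and `CEST` and the sample timestamp are assumptions chosen for the example, and real code would use the IANA time zone database rather than hard-coded offsets.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical fixed offsets standing in for real zone rules (an assumption;
# production code would use the IANA database instead of hard-coded offsets).
CDT = timezone(timedelta(hours=-5), "CDT")    # US Central, daylight time
CEST = timezone(timedelta(hours=2), "CEST")   # Central Europe, summer time

# One event, recorded once as an absolute moment in UTC.
stored = datetime(2025, 7, 1, 14, 30, tzinfo=timezone.utc)

# Presentation layer: convert only when displaying to a user.
in_texas = stored.astimezone(CDT)
in_berlin = stored.astimezone(CEST)

print(in_texas.isoformat())   # 2025-07-01T09:30:00-05:00
print(in_berlin.isoformat())  # 2025-07-01T16:30:00+02:00

# Both renderings are the same instant, so comparisons and ordering agree.
same_instant = in_texas == in_berlin == stored
print(same_instant)           # True
```

The stored value never changes; only its rendering does, which is why ordering and comparisons stay consistent no matter where the reader sits.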
JDBC (Java): Add ?timezone=UTC or similar to the connection URL, or execute the SET statement.
DBeaver and Setting Time Zones
Option 1 - Default to UTC Always
DBeaver (which uses the PostgreSQL JDBC driver) often defaults to your local machine's timezone (Austin, Texas = likely America/Chicago or similar). This can cause timestamptz values to be interpreted or displayed incorrectly compared to the server. There is no automatic "use server's timezone" option because the JDBC driver cannot reliably query the server's default timezone setting at connect time.
You can set the timezone in Window → Preferences → User Interface → Timezone and select UTC.
[Figure: DBeaver's options for time zones]
UTC is near the bottom. Restart DBeaver or reconnect to apply the change.
[Figure: +0000 is UTC]
Option 2 - On A Per-connection Basis
Maybe you only want to make sure specific servers use UTC. Right-click on that connection, select Edit Connection. Select the Driver Property tab.
[Figure: The Driver Property tab under Edit Connection]
Select the green plus sign.
[Figure: You can add the setting under User Properties]
Now add serverTimezone and set the value to UTC.
How to Verify it Worked
After reconnecting, run these in a SQL editor:
SHOW timezone; -- Should show 'UTC' (or whatever you set)
SELECT now(); -- Should return time in the session timezone
SELECT now() AT TIME ZONE 'UTC';
SELECT current_setting('timezone');
Compare the output to what you get in psql on the server.
Summary
Five Things Science Cannot Explain (but Theism Can)
In the past few centuries, science has exposed the underlying forces that allow the world to function — forces that had been hidden from humanity. There is a sense that science will continue to expose forces, going deeper and deeper. However, in recent decades, the limits of what science can explain have been exposed, and this article highlights those. Science cannot explain the
- Origin of the universe
- Origin of the fundamental laws of nature
- Fine-tuning of the Universe
- Origin of consciousness
- Existence of moral, rational, and aesthetic objective laws and intrinsically valuable properties
No matter how hard scientific researchers try, these will remain elusive, promoting the acceptance of forces beyond scientific explanation that we will never fully understand.
FSF Blogs: Free software offers trust and privacy; Ring offers mass surveillance
Vibhor Kumar: The Real Shift in Data Platforms Is Not Just AI. It Is Fewer Seams.

Introduction: The Market Is Talking About AI, but the Deeper Change Is Architectural
The database market is full of confident declarations right now. One vendor says the cloud data warehouse era is ending. Another argues that AI is redrawing the database landscape. A third claims that real-time analytics is now the center of gravity. Each story contains some truth, and each vendor naturally presents itself as the answer.
But there is a risk in taking these narratives too literally. The deeper shift in enterprise data platforms is not simply that AI is changing databases. It is that modern platforms are being forced to reduce the seams between systems. That is the more important architectural story, and it is the one that will matter long after today’s product positioning slides have been replaced by tomorrow’s.
For years, enterprises tolerated fragmented data architectures because the fragmentation felt manageable. One system handled transactions. Another handled analytics. A streaming layer was added for movement and enrichment. Dashboards sat elsewhere. Then machine learning appeared, followed by vector stores, feature stores, observability engines, and lakehouse layers. For a while, the industry treated this as normal evolution. Eventually, however, many teams discovered that they were not building a platform so much as negotiating peace between products.
That is why this moment matters. AI may be accelerating the conversation, but the real pressure is architectural. Enterprises are trying to simplify how data flows, how systems interact, and how teams operate. In other words, they are trying to remove seams.
The Problem: The Cost of Separation Has Become Too High
The old world was built around separation. In one sense, that separation was rational. Different workloads genuinely do have different requirements. Transactions need integrity and predictability. Analytics often need scale and throughput. Observability workloads have different ingestion and retention patterns. AI experimentation has yet another set of needs. It was never realistic to assume that one engine would elegantly solve every problem.
The problem is that every additional boundary also introduces friction. Every seam means more pipelines, more copied data, more latency, more governance overhead, more operational burden, and more confusion about where truth actually lives. The question is no longer whether each component in the architecture is individually useful. The question is whether the full arrangement is coherent enough to operate without creating constant drag.
Anyone who has spent time inside a PostgreSQL-heavy estate has seen this pattern clearly. PostgreSQL becomes the trusted system of record. Native logical replication is added to publish selected table changes, or CDC pipelines are introduced to feed analytical and operational consumers. Monitoring, governance, and workflow tools then accumulate around those flows. None of these components are inherently wrong. In fact, many are very useful. The issue is cumulative complexity. PostgreSQL supports logical replication directly, and the ecosystem has rich CDC tooling, but those capabilities still come with restrictions, state management, and operational decisions that can quietly multiply seams if they are not used carefully.
AI has sharpened this problem. Teams are less willing to accept stale dashboards, long batch windows, incomplete telemetry, fragmented access paths, or slow experimentation cycles. They want conversational analytics, AI-assisted operations, near-real-time responses, and production-like testing environments. Those expectations are pushing against architectural seams that were already creaking.
AI Is an Accelerator, Not the Only Cause
A great deal of current market discussion treats this as an AI-led shift. That interpretation is understandable, but incomplete. AI did not create the desire for lower-latency analytics, higher concurrency, fresher operational signals, open data access, or safer experimentation. Those needs were already present. AI simply made them impossible to postpone.
Once users expect to ask natural-language questions across business data, logs, metrics, and events, the underlying platform must become tighter, simpler, and more responsive. The old model of exporting data, transforming it, landing it elsewhere, waiting for the next refresh, and then asking a model to interpret it starts to look less like architecture and more like an elaborate apology. AI has not replaced the older pressures on data platforms. It has amplified them and exposed their consequences.
That is why the real architectural demand is not merely “become AI-ready.” It is more fundamental than that. Reduce unnecessary data movement. Clarify where truth lives. Simplify the path from transaction to analysis to experimentation. Build systems that can evolve without becoming a tax on the teams that run them.
Architecture Direction: Workload-Aware Platforms With Fewer Seams
One of the healthiest shifts in the market is that teams are finally acknowledging an obvious truth: not all data workloads are the same, and pretending otherwise usually produces more pain than elegance. Transactional integrity, distributed availability, real-time analytics, observability, object-storage-backed data sharing, and AI experimentation all place different demands on infrastructure.
At the same time, the answer cannot be uncontrolled tool sprawl. Replacing one giant monolith with half a dozen loosely governed subsystems is not modernization. It is simply a more fashionable version of fragmentation.
The right answer is neither “one platform does everything” nor “every workload gets its own product.” The more mature answer is workload-aware architecture with fewer seams. That means being explicit about where transactional truth lives, where analytical serving belongs, where open storage formats are useful, where data movement is justified, where it is wasteful, and how teams can experiment safely without destabilizing production.
This is where many platform conversations still go off course. They focus on what each engine can do but not on how much operational pain the architecture creates around it. Capabilities matter, of course, but operating model matters more. Enterprises do not merely buy performance. They buy simplicity, operational trust, resilience, governance, speed of change, and the ability to evolve without rebuilding the estate every few years.
That is why the most strategic question is not “Which engine is fastest?” It is “What platform shape can customers actually live with?”
A PostgreSQL-Centered View of Platform Evolution
This is where a Postgres-first platform story becomes strategically powerful. PostgreSQL has long been trusted as a system of record for transactional applications, and that remains one of its greatest strengths. But the future platform story is no longer just about OLTP. It is about how a PostgreSQL-centered architecture can extend from trusted operational state into replication-driven data sharing, analytics-oriented serving paths, observability, governance, and AI-adjacent workflows without forcing customers into a patchwork of unrelated systems from the start.
In practical terms, this is not abstract theory. It already reflects how real PostgreSQL estates evolve. PostgreSQL often remains the source of truth for operational data. Native logical replication provides publication/subscription-based change flow for selected tables, and CDC frameworks such as Debezium typically build on PostgreSQL logical decoding to push changes into streaming and downstream analytical systems. Foreign data wrappers make it possible to query remote systems through PostgreSQL, with postgres_fdw being the built-in example for external PostgreSQL servers. Declarative partitioning remains a core tool for large-table management and data lifecycle strategy. On the AI side, ecosystem extensions such as pgvector make vector similarity search possible inside PostgreSQL itself. On the operational side, views such as pg_stat_activity and extensions such as pg_stat_statements are foundational to PostgreSQL observability, while row-level security and pgAudit provide concrete governance mechanisms.
Distributed PostgreSQL patterns also exist, but it is important to describe them carefully. PostgreSQL core supports replication-based topologies, and the ecosystem includes distributed and sharding approaches, but there is not one single native core model that transparently turns PostgreSQL into a distributed database for every use case. That distinction matters because it keeps the platform story honest.
The PostgreSQL ecosystem is strong precisely because it offers so many paths forward. But that richness brings responsibility. Every additional component still has to fit into an operating model that a team can support. The real question is not whether PostgreSQL can participate in modern platform design. It clearly can. The more important question is whether we are designing PostgreSQL-centered platforms in a way that reduces seams rather than quietly multiplying them.
That distinction matters. It is the difference between a product portfolio and a platform.
A Concrete Example: What “Fewer Seams” Looks Like in Practice
It helps to make this discussion tangible. Consider a realistic enterprise pattern.
PostgreSQL serves as the system of record for customer transactions, account state, and policy-controlled operational data. Native logical replication or CDC publishes selected changes into a real-time analytical path that supports dashboards, fraud monitoring, support workflows, or AI-assisted investigation. The analytical path may be fed directly from publication/subscription flows or through logical-decoding-based CDC tooling, depending on the freshness and ecosystem requirements. Older and less frequently accessed data may then be written to object storage or other external analytical layers through ETL, ELT, or CDC-oriented pipelines rather than through a built-in PostgreSQL lakehouse feature. Observability captures database health, query behavior, replication lag, and pipeline status through PostgreSQL statistics views and adjacent tooling. Governance uses role design, row-level security, auditing, and pipeline controls to define what data can move, who can access it, and where experimentation is permitted.
A cloned or branched environment can then give teams a safe place to test schema changes, validate upgrades, run feature engineering pipelines, evaluate models, or experiment with prompts and retrieval workflows. Here too, precision matters: branching is not a PostgreSQL core feature. In the PostgreSQL ecosystem, it is more often delivered by platform implementations or snapshot/cloning approaches, with Neon being one visible example of a branching-oriented model.
There is nothing exotic about this pattern. Many teams are already building versions of it. The difference between a healthy implementation and a fragile one is not the existence of the components themselves. It is whether the transitions between them feel natural, governed, and operationally manageable. That is what fewer seams means in practical terms. It does not mean pretending all workloads are the same. It means reducing the friction between the workloads that are different.
Why Data Branching Deserves More Attention
One of the most under-discussed capabilities in current platform conversations is data branching. Most market discussion still revolves around transactions, analytics, streaming, AI, and storage formats. Far less attention is given to how developers, analysts, data scientists, and AI engineers actually work day to day.
Those teams need isolated environments, production-like datasets, safe testing, fast rollback, reproducible experiments, and controlled validation of schema, policy, or pipeline changes. Without branching or a strong equivalent, teams often solve this awkwardly. They duplicate environments by hand, copy datasets in ad hoc ways, test against stale clones, risk touching shared systems, or avoid testing as thoroughly as they should because setup is too painful.
That is not just inefficient. It slows down innovation and increases operational risk.
In a PostgreSQL-centered world, this becomes especially important because PostgreSQL often contains the most trusted operational data in the estate. Safe cloning, snapshotting, or branching-style workflows become valuable not only for application development, but also for upgrade validation, analytics testing, security review, and AI experimentation. But that capability should be described honestly: it is largely an ecosystem and platform-layer concern, not something PostgreSQL core exposes as a native “branch this database” primitive.
As AI and advanced analytics become more embedded in enterprise workflows, branching stops being a convenience feature and becomes part of the platform’s ability to support rapid, governed iteration.
Serverless Postgres Is Valuable, but It Is Not the Full Story
Another important theme in the market is serverless Postgres. The appeal is real. Serverless models can improve developer onboarding, reduce idle costs, support bursty workloads, simplify provisioning, and make experimentation easier. For many use cases, that is meaningful progress.
But it is important not to confuse a delivery model with a complete architectural destination. Serverless Postgres addresses convenience and elasticity. It does not automatically address enterprise-grade availability, globally distributed write patterns, regulated deployment constraints, topology control, performance predictability, platform-level governance, or integrated analytical and AI workflows.
That does not make serverless unimportant. On the contrary, it will continue to matter. It lowers the barrier to entry and fits many modern application patterns well. It also aligns naturally with one part of the broader platform story: faster environment creation and easier experimentation. But enterprise strategy is larger than provisioning style. The real question remains how transactional systems, analytics, experimentation, governance, and AI workloads come together in a way that is both powerful and operable.
That question is bigger than serverless alone, and leaders should resist pretending otherwise.
The Market Perspective: What the Industry Is Really Telling Us
If we step back from individual vendor narratives, the market seems to be saying a few clear things. Customers still want trusted transactional systems. They increasingly need real-time and near-real-time analytical access. They do not want to pay an integration penalty every time a new use case appears. AI is making freshness, experimentation, and observability more central to platform design. And perhaps most importantly, teams are tired of architectures that look impressive in diagrams but expensive in real life.
That is why I believe the next era of data platforms will be shaped by a simple principle: keep specialization where it adds real value, and eliminate seams everywhere else.
This is not an argument against multiple engines, open formats, or analytics specialization. It is an argument against unnecessary architectural tax. If a specialized component clearly improves outcomes, simplifies operations, or enables use cases that genuinely matter, it is worth having. But if the architecture accumulates copies, handoffs, and dependencies that exist only because the platform was assembled one product at a time, then the burden eventually exceeds the benefit.
The market is not merely rewarding better performance. It is rewarding architectures that are easier to reason about.
Strategic Implications for Technology Leaders
For technology leaders, the most important question is no longer whether the market is changing. It is whether their platform response is centered on the right priorities.
Those priorities should include trusted operational state, resilience where needed, analytics acceleration without unnecessary duplication, safe experimentation through controlled cloning or branching-style workflows, and simpler paths from data to decision to AI. But these principles only matter if they show up in operational choices.
That means reducing redundant data movement instead of normalizing endless copies. It means standardizing source-of-truth boundaries so teams know where truth lives. It means investing in safe experimentation environments rather than allowing informal clones and shadow systems to proliferate. It means simplifying analytics and AI data paths wherever possible. And it means evaluating platform choices not only on raw performance, but also on operational burden, governance impact, and long-term maintainability.
This is the kind of thinking that matters to architects, CTOs, and CIOs alike. Performance remains important, but it is not enough. The strategic prize is a platform that moves fast without becoming fragile, supports innovation without multiplying risk, and evolves without forcing repeated architectural reset.
Conclusion: The Future Is Not Just AI-Ready. It Is Seam-Aware.
The database market loves grand narratives. Warehouses are dead. AI changes everything. This engine wins. That architecture loses. The real world is usually less theatrical and more practical.
The winners in the next phase of the platform market will not be the ones that shout the loudest about replacing everything. They will be the ones that help enterprises simplify what has become too fragmented, accelerate what has become too slow, and experiment safely without multiplying operational risk.
That is why I believe the future is not just AI-ready. It is seam-aware.
The best platforms will be the ones that know where specialization genuinely helps, where PostgreSQL-centered architecture can remain the anchor, where replication and CDC are worth their cost, where observability and governance are first-class concerns, and where architectural seams should simply disappear. In the years ahead, that ability to reduce drag may matter more than any individual benchmark or marketing claim. And for many organizations, it will be the difference between a platform that merely exists and one that actually helps the business move.
Google details new 24-hour process to sideload unverified Android apps (Ars Technica)
Here are the steps:
- Enable developer options by tapping the software build number in About Phone seven times
- In Settings > System, open Developer Options and scroll down to "Allow Unverified Packages."
- Flip the toggle and tap to confirm you are not being coerced
- Enter device unlock code
- Restart your device
- Wait 24 hours
- Return to the unverified packages menu at the end of the security delay
- Scroll past additional warnings and select either "Allow temporarily" (seven days) or "Allow indefinitely."
- Check the box confirming you understand the risks.
- You can now install unverified packages on the device by tapping the "Install anyway" option in the package manager.
Submarines in Swedish Waters
I try to keep current on revelations in the spy world, but this one got by me, perhaps because it was more of a psychological operation (PSYOP) than a spy operation to collect secret information.
When Ronald Reagan became president in 1981, he took a more aggressive approach to relations with the Soviet Union, which eventually led to its collapse a decade later. The 2009 book Reagan's Secret War: The Untold Story of His Fight to Save the World from Nuclear Disaster outlines this policy.
In trying to put maximum pressure on the Soviet Union, Reagan wanted to use airfields in Sweden in the event of a war. However, Sweden's left-of-center prime minister Olof Palme wanted to steer a neutral course between NATO and the Soviet Union. Swedish popular opinion supported this approach, with only 6% perceiving the Soviet Union as a direct threat.
The Plastics Recycling Industry
Last year I wrote about the falsities of plastics recycling. This new video talks more specifically about the politics of the plastics industry, and states, "they didn't really need it [recycling] to work; they needed people to believe that it was working." This report has many more details.
Bad Last Couple of Years
I read several news articles a day, and sometimes I read the article's comments. Fifty days ago I read an article about Venezuela, and one of the comments to that article by dFreeThinker made an impact on me:
It's been a really bad last couple of years for authoritarian dictators and terrorist groups around the world: Assad fleeing Syria; corrupt leftist governments getting voted out of power in Chile, Bolivia, Argentina, and Ecuador; terrorist groups like Hamas, Hezbollah, and the Houthis being severely weakened, if not completely decimated and losing financial and material support from Iran; the supreme leader of Iran in hiding because he fears assassination by the Mossad, his country getting racked by protests, and his nuclear weapons program getting obliterated; Russia getting bogged down in Ukraine along with low world oil prices that is hindering their war effort and economy; terrorist groups in Nigeria getting bombed; Venezuelan dictator Nicolas Maduro being captured and removed from power; Cuba losing a major economic lifeline (oil) from their main ally Venezuela; China largely losing their sphere of influence in Central and South America.
It just goes to show that when governments don't focus in on improving the lives of their own citizens and instead establish kleptocracies where the leaders flourish and the average citizen is impoverished, eventually the people of these nations lose their patience and either vote them out or remove them by force. I hope every dictator in the world goes to bed at night terrified that they will be the next domino to fall!
Europe Is Delusional
When the Euro currency was created by the European Union in 1999, it promised a single currency that could unify an economic zone to rival the United States. On my trips to Europe in the early 2000s, Europeans waxed lyrical about how their new currency would challenge the United States for dominance.
How things have changed.
While the European Union added countries, and then lost the United Kingdom, the European economic zone never came to rival the United States. In fact, the gap between the United States and European economies has only grown since 1999 (report, report); this quote says it all:
According to the World Bank, in the period 2008-2023, EU GDP grew by 13.5% (from $16.37 trillion to $18.59 trillion) while U.S. GDP rose by 87% (from $14.77 to $27.72 trillion). The UK's GDP increased by 15.4%. In 2023, EU GDP was 67% of U.S. GDP — down from 110% in 2008.
Vibhor Kumar: From Exit to Evolution

A Migration Framework That Turns PostgreSQL into a Modernization Engine
Three years ago, I sat in a conference room with a CIO who had just finished reviewing next year’s database renewal costs.
He closed the folder.
Looked up.
And said quietly:
“This is not a database problem. This is a dependency problem.”
That moment captures what most enterprises eventually realize.
They are not paying for technology.
They are paying for architectural gravity.
And that is where migration begins.
But here is the mistake I see repeatedly:
Organizations migrate to PostgreSQL…
and unknowingly carry their old architecture with them.
They change engines.
They keep the same vehicle.
Migration happened.
Modernization didn’t.
This article is about preventing that outcome.
The Two-Layer Reality: MIGRATE → MODERNIZE
Successful enterprise transformation requires two deliberate phases:
Layer 1: MIGRATE+
Engineering discipline. Zero surprises. Controlled execution.
Layer 2: MODERNIZE
Architectural redesign. Operational autonomy. Strategic leverage.
Most programs stop at Layer 1.
Leaders finish Layer 2.
Part I — MIGRATE+: The Discipline of Getting It Right
Let me share a short story.
A large financial institution once decided to “move quickly” off Oracle.
They converted schema.
Loaded data.
Switched applications.
Six weeks later, they discovered reconciliation mismatches in a reporting pipeline that had never been fully mapped.
The issue wasn’t PostgreSQL.
It was skipped discipline.
Migration at scale requires structure.
M — Map the Landscape
Before touching DDL, map everything:
- PL/SQL packages and procedural density
- Scheduler jobs
- DB links
- Cross-application dependencies
- Data classifications
- HA/DR expectations
- Regulatory overlays
You are not inventorying tables.
You are mapping institutional dependency.
One retail enterprise I worked with discovered that a single materialized view was feeding five downstream systems — none documented. That discovery changed their entire cutover strategy.
Mapping is not overhead.
It is risk containment.
I — Identify Application & Integration Friction
Oracle systems often assume:
- Implicit commits
- Optimizer hints
- Autonomous transactions
- Exception-handling patterns
One SaaS provider we advised had embedded optimizer hints across thousands of dynamic SQL statements. Their application performed beautifully in Oracle.
In PostgreSQL? It stalled — until the query logic was redesigned.
Compatibility scanning must include:
- SQL dialect analysis
- ORM behavior validation
- Isolation level testing
- Retry semantics
- Integration mapping (batch, APIs, streaming)
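
One small piece of SQL dialect analysis is easy to automate: finding Oracle optimizer hints embedded in SQL text. The sketch below is a minimal illustration, not a full compatibility scanner. It only recognises the block-comment hint form (`/*+ ... */`), and the function name `find_optimizer_hints` is hypothetical.

```python
import re

# Oracle block-comment hints look like /*+ INDEX(t idx) */.
# Ordinary comments (/* ... */ without the plus sign) are ignored.
HINT_PATTERN = re.compile(r"/\*\+\s*(.*?)\s*\*/", re.DOTALL)


def find_optimizer_hints(sql):
    """Return the text of every block-comment optimizer hint found in sql."""
    return [match.group(1) for match in HINT_PATTERN.finditer(sql)]


if __name__ == "__main__":
    statement = "SELECT /*+ INDEX(o orders_idx) */ * FROM orders o"
    for hint in find_optimizer_hints(statement):
        print("hint found:", hint)
```

Run across application source and dynamic SQL templates, a scan like this gives an early count of statements that will need query redesign rather than mechanical translation.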
Modernization begins when you confront these assumptions intentionally.
G — Govern Security & Compliance from Day Zero
In regulated industries, governance is not optional.
I once worked with a payments organization that completed migration successfully — only to face an audit finding six months later because audit trails were not aligned with their previous Oracle implementation.
The lesson?
Compliance must be engineered, not assumed.
Design PostgreSQL with:
- Encryption at rest
- TLS enforcement
- Role-based access controls
- Row-level security where required
- Audit integration
- SIEM alignment
Trust is a design decision.
R — Replicate Schema & Data with Optimization
Schema conversion is not copy-paste.
It is translation — and sometimes transformation.
Blindly replicating Oracle partitioning or indexing strategies misses PostgreSQL-native strengths.
Data migration at scale is choreography:
- Initial bulk load
- Continuous change capture
- Validation checkpoints
- Cutover rehearsal
- Defined rollback triggers
A telecom migration we supported rehearsed cutover three times before the real event.
On the final weekend, execution took hours — not days.
Rehearsal reduces drama.
A — Assure Through Testing & Benchmarking
Testing is not “did it load?”
It is:
- Row count reconciliation
- Checksum validation
- Query plan analysis
- Peak load simulation
- Failover testing
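
The first two items, row count reconciliation and checksum validation, can be sketched in a few lines. This is a minimal illustration under stated assumptions: `sqlite3` in-memory databases stand in for the source and target connections, and the names `table_fingerprint`, `reconcile`, and `make_demo_db` are hypothetical. A real Oracle-to-PostgreSQL comparison would also need to normalise type representations (numbers, dates, trailing spaces) before hashing.

```python
import hashlib
import sqlite3


def table_fingerprint(conn, table, key_column):
    """Return (row_count, sha256 hex digest) for a table, reading rows in key order."""
    digest = hashlib.sha256()
    count = 0
    # Table and column names are trusted here; never interpolate user input this way.
    for row in conn.execute(f"SELECT * FROM {table} ORDER BY {key_column}"):
        digest.update(repr(row).encode("utf-8"))
        count += 1
    return count, digest.hexdigest()


def reconcile(source_conn, target_conn, table, key_column):
    """Compare row counts and checksums between a source and a target table."""
    src_count, src_sum = table_fingerprint(source_conn, table, key_column)
    tgt_count, tgt_sum = table_fingerprint(target_conn, table, key_column)
    return {
        "table": table,
        "row_count_match": src_count == tgt_count,
        "checksum_match": src_sum == tgt_sum,
    }


def make_demo_db(rows):
    """Build an in-memory stand-in database with a small accounts table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance TEXT)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", rows)
    return conn


if __name__ == "__main__":
    source = make_demo_db([(1, "10.00"), (2, "20.50")])
    target = make_demo_db([(1, "10.00"), (2, "20.05")])
    print(reconcile(source, target, "accounts", "id"))
```

Note that the drifted row in the demo passes the row count check and fails only the checksum, which is why both checks belong in the suite.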
One enterprise found that PostgreSQL outperformed Oracle in reporting workloads — but only after indexing was redesigned intentionally.
Performance is not automatic.
It is engineered.
T — Transform Applications & Execute Cutover
PL/SQL refactoring.
Hint removal.
Transaction adjustments.
Connection pooling redesign.
Cutover is not the end.
It is the inflection point.
After this moment, your organization will either:
- Run PostgreSQL like a new Oracle
- Or treat it as a platform foundation
That choice determines modernization trajectory.
E — Establish Operational Excellence
The quiet failures happen post go-live.
Backup strategy.
HA configuration.
Observability discipline.
Upgrade planning.
An enterprise that lacks operational maturity has not modernized — it has relocated.
PostgreSQL must be operated with SRE discipline, not reactive firefighting.
The Pivot: Migration Is Necessary. Modernization Is Optional.
Most enterprises celebrate after go-live.
Licenses reduced.
Contracts renegotiated.
Stability achieved.
But here is the strategic question:
If your PostgreSQL architecture mirrors your Oracle architecture,
what actually changed?
Migration replaces technology.
Modernization redesigns capability.
Part II — MODERNIZE: The Platform Mindset
Now we elevate.
This is where PostgreSQL becomes strategic.
M — Modularize Architecture
Break monolith databases into domain-aligned services.
Separate transactional workloads from analytics acceleration.
Introduce streaming pipelines.
Reduce tight coupling.
Modernization reduces fragility.
O — Operationalize Autonomy
Automate failover.
Automate scaling.
Embed observability into engineering workflows.
The goal is simple:
The platform should need fewer emergency interventions over time.
D — Democratize Data
Enable analytics acceleration.
Integrate vector search where AI use cases demand it.
Expose governed APIs.
Reduce shadow IT analytics.
Modernization increases access — without increasing chaos.
E — Engineer for Elasticity
Multi-AZ resilience.
Geo-distribution when justified.
Cost-aware scaling.
Storage tiering.
Flexibility is leverage.
R — Reposition Organizational Capability
The biggest shift is not technical.
It is cultural.
Build internal PostgreSQL expertise.
Create a Center of Excellence.
Reduce proprietary skill silos.
Architectural sovereignty changes decision dynamics.
N — Navigate Continuous Evolution
Upgrade proactively.
Adopt new PostgreSQL capabilities.
Integrate AI-driven use cases.
Embed FinOps governance.
Modernization is not a milestone.
It is a discipline.
Migration vs Modernization
| Migration | Modernization |
|---|---|
| Replace vendor | Redesign architecture |
| Reduce license cost | Increase leverage |
| Match behavior | Improve performance |
| Stabilize | Evolve |
| Technical initiative | Enterprise transformation |
Final Reflection
The CIO in that conference room was right.
It wasn’t a database problem.
It was a dependency problem.
PostgreSQL is not just a cost lever.
It is an architectural reset.
Migration is the doorway.
Modernization is the architecture you build once you walk through it.
The question is no longer:
“Can we move off Oracle?”
It is:
“Are we ready to redesign our digital core?”
That answer defines whether you simply exit — or truly evolve.
Dave Page: Teaching an LLM What It Doesn't Know About PostgreSQL
Large language models know a remarkable amount about PostgreSQL. They can write SQL, explain query plans, and discuss the finer points of MVCC with genuine competence. But there are hard limits to what any model can know, and when you're building tools that connect LLMs to real databases, those limits become apparent surprisingly quickly.
The core issue is training data. Models learn from whatever was available at the time they were trained, and that corpus is frozen the moment training ends. PostgreSQL 17 might be well represented in a model's training data, but PostgreSQL 18 almost certainly isn't if the model was trained before the release. Extensions and tools from smaller companies are even worse off, because there simply isn't enough public documentation, blog posts, and Stack Overflow discussions for the model to have learned from. And products that were released after the training cutoff are invisible entirely.
This is the problem we set out to solve with the knowledgebase system in the pgEdge Postgres MCP Server. Rather than hoping the LLM already knows what it needs, we give it a tool that lets it search curated, up-to-date documentation at query time and incorporate the results into its answers. It's RAG, in essence, but tightly integrated into the MCP tool workflow so the LLM can use it as naturally as it would run a SQL query.
Products the LLM has never heard of
To understand why this matters, consider a few of the products whose documentation we index.
Spock is an open source PostgreSQL extension that provides asynchronous multi-master logical replication. It allows multiple PostgreSQL nodes to accept both reads and writes simultaneously, with automatic conflict resolution between nodes. It supports automatic DDL replication, configurable conflict resolution strategies, row filtering, column projection, and cross-version replication for zero-downtime upgrades. Spock grew out of earlier work on pgLogical and BDR2, but has been substantially enhanced since pgEdge first introduced it in 2023.
If you ask an LLM about Spock without any supplementary context, you'll most likely get an answer about the Java testing framework of the same name, or at best a vague and outdated reference to the PostgreSQL extension. The model has no way of knowing about the current configuration syntax, the available conflict resolution modes, or how to set up a multi-node cluster with the latest release. The documentation simply wasn't in its training data, and for a niche product in a specialised corner of the PostgreSQL ecosystem, it never will be in sufficient detail.
The pgEdge RAG Server is another example. It's a Go-based API server for Retrieval-Augmented Generation that uses PostgreSQL with pgvector as its backend, combining vector similarity search with BM25 text matching for hybrid retrieval. The entire product was announced in December 2025 as part of the pgEdge Agentic AI Toolkit, which means any model trained before that date knows nothing about it whatsoever.
The same applies to other pgEdge components like the pgEdge Platform itself, which bundles standard PostgreSQL with Spock replication, the ACE consistency engine, Snowflake Sequences for globally unique IDs, and over twenty popular extensions into a self-managed distributed PostgreSQL distribution. Each of these products has its own documentation covering installation, configuration, and troubleshooting, and none of it is likely to appear in a model's training data with any reliability.
Even PostgreSQL itself presents a moving target. The official documentation runs to thousands of pages and changes with every major release. A model trained on PostgreSQL 16 documentation will give subtly wrong answers about features that were added or changed in version 17 or 18, and it has no way of knowing that its information is out of date.
How we built the knowledgebase
The knowledgebase is built offline by a dedicated builder tool that processes documentation from a variety of sources and stores the results in a SQLite database. The builder supports several input formats, including Markdown, HTML, reStructuredText, DocBook XML, and the SGML format used by the official PostgreSQL documentation. Each format is converted to clean Markdown before chunking, with format-specific handling to preserve the structure of the original content.
The sources themselves can be git repositories or local filesystem paths, which makes the system flexible enough to index far more than just product documentation. For git repositories, the builder clones each one and checks out the appropriate branch or tag for each version. Local paths can point at anything on the filesystem, including exported blog posts, internal support knowledge base articles, or runbooks that your team has accumulated over time. If it can be converted to Markdown, HTML, or one of the other supported formats, it can go into the knowledgebase.
A single configuration file defines all the documentation sources, and we currently index documentation for PostgreSQL versions 14 through 18, several versions of pgAdmin, and a range of pgEdge products including Spock, the RAG Server, the Postgres MCP Server, pgEdge Platform, PostGIS, pgvector, pgBouncer, and pgBackRest. But the same mechanism works equally well for your own content. A team that maintains a collection of blog posts about their database architecture, or an internal wiki with troubleshooting guides and operational procedures, can add those as local path sources and have them appear alongside the official product documentation in the knowledgebase. The LLM doesn't distinguish between the two; it simply searches the entire corpus and returns whatever is most relevant to the query.
Chunking
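
As a rough orientation for the two-pass approach this section describes, here is a minimal Python sketch. It is not the builder's actual code: it treats blank-line-separated blocks as structural elements (a crude stand-in for real Markdown parsing), omits the type-specific splitting of oversized elements, and merges a small chunk only with its immediate neighbour. The function names are hypothetical; the 250, 100, 300, and 3,000 limits are the ones quoted in the text.

```python
TARGET_WORDS = 250   # first-pass target size per chunk
MIN_WORDS = 100      # chunks below this are candidates for merging
MAX_WORDS = 300      # a merged chunk may not exceed this many words
MAX_CHARS = 3000     # nor this many characters


def word_count(text):
    return len(text.split())


def first_pass(markdown):
    """Group whole elements into chunks of roughly TARGET_WORDS, never splitting one."""
    elements = [e.strip() for e in markdown.split("\n\n") if e.strip()]
    chunks, current = [], []
    for element in elements:
        candidate = current + [element]
        if current and word_count("\n\n".join(candidate)) > TARGET_WORDS:
            # Adding this element would overshoot the target: close the chunk here.
            chunks.append("\n\n".join(current))
            current = [element]
        else:
            current = candidate
    if current:
        chunks.append("\n\n".join(current))
    return chunks


def merge_small(chunks):
    """Second pass: fold an undersized chunk into its neighbour when limits allow."""
    merged = []
    for chunk in chunks:
        if merged:
            combined = merged[-1] + "\n\n" + chunk
            small = word_count(chunk) < MIN_WORDS or word_count(merged[-1]) < MIN_WORDS
            if small and word_count(combined) <= MAX_WORDS and len(combined) <= MAX_CHARS:
                merged[-1] = combined
                continue
        merged.append(chunk)
    return merged


def chunk_document(markdown):
    return merge_small(first_pass(markdown))
```

The two passes are kept separate on purpose: the first respects element boundaries, and the second only ever joins whole chunks, so neither can cut a code block or table in half.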
Converting whole documents into something useful for semantic search requires breaking them into chunks that are small enough to be meaningful as individual search results but large enough to carry sufficient context. We use a two-pass hybrid algorithm that preserves the structural elements of the source documents.
In the first pass, the algorithm parses the Markdown content into structural elements: code blocks, tables, lists, blockquotes, and paragraphs. It never splits within a structural element, because a code block that's been cut in half is useless as a search result. Instead, it splits at the boundaries between elements, targeting around 250 words per chunk. When an individual element exceeds the target size, it uses type-specific splitting strategies. Code blocks split at line boundaries with fencing re-added to each piece. Tables split at row boundaries with the header row preserved in each chunk. Lists split at top-level item boundaries, and paragraphs split at sentence boundaries.
The second pass merges undersized chunks. Any chunk smaller than 100 words is merged with an adjacent chunk, provided the combined result doesn't exceed 300 words or 3,000 characters. The size constraints are deliberately conservative to maintain compatibility with Ollama models that have lower token limits, but they also happen to produce chunks that work well with all the embedding providers we support.
One detail that turned out to be more important than we expected is heading hierarchy tracking. As the chunker works through a document, it maintains a stack of headings at each level. When it creates a chunk, it records the full heading path, so a chunk about OAuth configuration might carry the hierarchy "API Reference > Authentication > OAuth". This context significantly improves the quality of search results, because the embedding captures not just the content of the chunk but its position in the broader document structure.
Embeddings
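
To illustrate the storage format this section describes, the sketch below packs an embedding into a float32 binary blob and stores it in SQLite. It is a minimal example under assumptions: the table layout, the column names, and the helper functions are all hypothetical, and little-endian byte order is an arbitrary choice made here for illustration.

```python
import sqlite3
import struct


def pack_embedding(vector):
    """Serialise a list of floats as little-endian float32 bytes (4 bytes per value)."""
    return struct.pack(f"<{len(vector)}f", *vector)


def unpack_embedding(blob):
    """Inverse of pack_embedding: turn the blob back into a list of floats."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


def open_store():
    """Create an in-memory store with one embedding row per (chunk, provider)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE chunk_embeddings ("
        " chunk_id INTEGER, provider TEXT, embedding BLOB,"
        " PRIMARY KEY (chunk_id, provider))"
    )
    return conn


def save_embedding(conn, chunk_id, provider, vector):
    conn.execute(
        "INSERT INTO chunk_embeddings VALUES (?, ?, ?)",
        (chunk_id, provider, pack_embedding(vector)),
    )


def load_embedding(conn, chunk_id, provider):
    row = conn.execute(
        "SELECT embedding FROM chunk_embeddings WHERE chunk_id = ? AND provider = ?",
        (chunk_id, provider),
    ).fetchone()
    return unpack_embedding(row[0]) if row else None
```

Each value costs exactly four bytes in this form, whereas a JSON array spends a variable number of characters per value plus separators, which is where the space saving comes from.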
Each chunk is embedded using all three supported providers: OpenAI, Voyage AI, and Ollama (the last for fully offline operation). The embeddings from every provider are generated in parallel and stored together as compact float32 binary blobs in the SQLite database, which is considerably more space-efficient than storing them as JSON arrays.
The reason for embedding with all three providers at build time is purely practical. By shipping a knowledgebase database that already contains OpenAI, Voyage AI, and Ollama embeddings side by side, the system administrator installing the MCP server can simply choose whichever embedding provider suits their environment. An organisation that uses OpenAI for everything can use the OpenAI embeddings. A team that needs fully offline operation can use the Ollama embeddings without having to regenerate the entire database themselves. At query time, the tool automatically selects the embeddings that match the configured provider, with a smart fallback to other providers if the preferred one happens to be missing for a particular chunk.
The builder is incremental. It uses SHA-256 checksums to detect which source files have changed since the last build, and only re-processes files that are new or modified. It also deduplicates across versions, since documentation that hasn't changed between PostgreSQL releases doesn't need to be chunked and embedded again. For a full build covering all PostgreSQL versions from 14 to 18 plus all the pgEdge products, the result is a database of roughly 150,000 chunks that takes around 25 to 50 minutes to generate embeddings for using the cloud providers.
How the LLM uses the knowledgebase
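
The search step this section describes boils down to filtering by product and version, scoring by cosine similarity, and keeping the top results. The sketch below shows that shape in plain Python. It is an illustration only: the `search` function, its parameters, and the dictionary layout of a chunk are hypothetical, and the toy two-dimensional vectors stand in for real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(chunks, query_vector, product=None, version=None, top_k=5):
    """Return up to top_k (score, chunk) pairs, best first, after optional filtering."""
    candidates = [
        c for c in chunks
        if (product is None or c["product"] == product)
        and (version is None or c["version"] == version)
    ]
    scored = [(cosine_similarity(query_vector, c["embedding"]), c) for c in candidates]
    # Sort on the score only, so chunks themselves never need to be comparable.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    corpus = [
        {"product": "spock", "version": "5.0.4", "text": "conflict resolution", "embedding": [1.0, 0.0]},
        {"product": "spock", "version": "5.0.4", "text": "installation", "embedding": [0.0, 1.0]},
        {"product": "postgresql", "version": "14", "text": "vacuum", "embedding": [1.0, 0.0]},
    ]
    for score, chunk in search(corpus, [1.0, 0.0], product="spock"):
        print(round(score, 3), chunk["text"])
```

Filtering before scoring is what keeps an equally similar chunk from an unrelated product out of the results, and it also reduces the number of similarity calculations.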
The knowledgebase is exposed to the LLM as a single MCP tool. The tool accepts a natural language query and returns the most semantically similar chunks from the database. Behind the scenes, it converts the query into a vector embedding using whichever provider is configured for the MCP server, then calculates cosine similarity against the corresponding embeddings stored in the knowledgebase and returns the top results.
The tool supports filtering by product name and version, which is important both for relevance and for token efficiency. If the user is asking about Spock replication, there's no point returning chunks from the PostgreSQL 14 documentation or the pgBouncer manual. The LLM can also call the tool with a dedicated parameter to discover what documentation is available before performing a search, which prevents it from guessing at product names that need to match exactly.
A typical interaction looks something like this. The user asks a question about configuring Spock multi-master replication. The LLM recognises that this is a topic it may not have reliable training data for, so it first calls the tool with that discovery parameter set to true. It sees that documentation for Spock 5.0.4 is available, and calls the tool again with a targeted query and the product name filter. The tool returns the five most relevant chunks from the Spock documentation, which the LLM reads, synthesises, and presents to the user as a coherent answer with accurate configuration details and version-specific information.
The key insight is that the LLM doesn't need to know about Spock in advance. It just needs to know that the tool exists and that it can search for documentation on products it isn't confident about. The tool descriptions include guidance that encourages this behaviour, and in practice we find that LLMs are quite good at recognising when they're uncertain and reaching for the knowledgebase rather than guessing.
What makes this different from generic RAG
The distinction between the knowledgebase and a generic RAG setup is worth drawing out. A general-purpose RAG system typically indexes whatever documents you throw at it and returns results based purely on semantic similarity. The knowledgebase is more opinionated. It understands the concept of products and versions, so it can filter results to a specific release. It uses a chunking algorithm that was designed specifically for technical documentation, preserving code blocks, tables, and heading hierarchies rather than splitting blindly on token counts. And because it's integrated into the MCP tool framework, the LLM can use it alongside the database query tools in the same conversation, checking the documentation for a feature before writing a query that uses that feature.
The practical difference is that the LLM can give accurate, version-specific answers about products and features that are completely absent from its training data. That's not something you get from prompt engineering or fine-tuning, because neither approach can inject knowledge about a product that was released after the model was trained. The knowledgebase is simply the most practical way to bridge the gap between what the model knows and what the user needs.
Try it yourself
The pgEdge Postgres MCP Server is open source under the PostgreSQL licence, and the knowledgebase builder and search tool are included. You can build a knowledgebase from your own documentation sources, or use the pre-built database that ships with the project's releases. Full documentation is available at docs.pgedge.com.
Lætitia AVROT: Why Your HA Architecture is a Lie (And That's Okay)
WHAT MEANING MEANS: BUSINESS RULES, PREDICATES, CONSTRAINTS, AND SEMANTIC CONSISTENCY
“If we step back and look at what RDBMS is, we’ll no doubt be able to conclude that, as its name suggests (i.e., Relational Database Management System), it is a system that specializes in managing the data in a relational fashion. Nothing more. Folks, it’s important to keep in mind that it manages the data, not the MEANING of the data! And if you really need a parallel, RDBMS is much more akin to a word processor than to an operating system. A word processor (such as the much maligned MS Word, or a much nicer WordPress, for example) specializes in managing words. It does not specialize in managing the meaning of the words ... So who is then responsible for managing the meaning of the words? It’s the author, who else? Why should we tolerate RDBMS opinions on our data? We’re the masters, RDBMS is the servant, it should shut up and serve. End of discussion.” --Alex Bunardzik, Should Database Manage The Meaning?
Umair Shahid: PostgreSQL on Kubernetes vs VMs: A Technical Decision Guide
If your organization is standardizing on Kubernetes, this question shows up fast:
“Should PostgreSQL run on Kubernetes too?”
The worst answers are the confident ones:
- “Yes, because everything else is on Kubernetes.”
- “No, because databases are special.”
Both are lazy. The right answer depends on what you’re optimizing for: delivery velocity, platform consistency, latency predictability, operational risk, compliance constraints, and, most importantly, who is on-call when things go sideways.
I have seen PostgreSQL run very well on Kubernetes. I’ve also seen teams pay a high “complexity tax” for benefits they never actually used. This post is an attempt to give you a technical evaluation you can use to make a decision that fits your environment.
Start with the real question: are you running a database, or building a database platform?
This is the cleanest framing I have found:
- Running a database: You have a small number of production clusters that are business-critical. You want predictable performance, understandable failure modes, straightforward upgrades, and clean runbooks.
- Building a database platform: You want self-service provisioning, standardized guardrails, GitOps workflows, multi-tenancy controls, and a repeatable API so teams can spin up PostgreSQL clusters without opening tickets.
Kubernetes shines in the second world. VMs shine in the first.
Yes, you can do either on either platform. But the default fit differs.
A neutral comparison model: 6 dimensions that actually matter
Here is a practical rubric you can use in architecture reviews.
If you want a quick decision shortcut:
If your main goal is self-service and standardization, Kubernetes is compelling. If your main goal is predictable performance and lower operational surface area, VMs or bare metal are compelling.
What Kubernetes adds (and why it’s both good and risky)
Kubernetes wasn’t designed primarily for databases. It was designed for scheduling workloads, handling health checks, rolling updates, and service discovery. PostgreSQL can run well there, but you typically stack multiple control layers:
- Stateful identity and scheduling
- Persistent volumes
- CSI/storage drivers
- Operators for lifecycle management
- Sidecars for backups/metrics/log shipping
That’s not inherently bad. It’s powerful. But each layer is another thing to understand, upgrade, monitor, and debug. There is also the ‘agony of choice’ when selecting the operator for lifecycle management. There are quite a few available, and none are perfect.
The biggest Kubernetes “gotcha” for PostgreSQL isn’t that it doesn’t work. It’s that when something goes wrong, the failure analysis can shift from “what is Postgres doing?” to “which Kubernetes subsystem is influencing Postgres right now?”
A very common pattern: a performance incident that starts as “write latency spiked” turns out to be tied to eviction behavior, scheduling pressure, or storage-layer hiccups. Those are solvable problems, but only if you already have deep Kubernetes operational maturity.
What VMs give you (and what they don’t)
VMs are boring in the best way: fewer abstraction layers between PostgreSQL and the hardware.
That usually means:
- More predictable latency (especially disk + network)
- Easier kernel-level tuning (huge pages, I/O scheduler, NUMA considerations)
- Simpler operational failure analysis (“the host is slow” is a real thing you can measure and act on)
- More straightforward incident response for teams that already have VM/host tooling
But VMs aren’t “free” either. The cost shows up in different places:
- Slower provisioning and less self-service
- More configuration drift risk (“snowflake servers”)
- More manual day-2 operations unless you build good automation
- Higher discipline required for patching, backups, and failover testing
The platform might be simpler; the process still needs maturity.
The performance reality: storage and network decide more than “K8s vs VM”
Most “Postgres on Kubernetes is slow” stories are really one of these:
- The storage class wasn’t suited for database workloads.
- CPU throttling or noisy neighbor effects were introduced through cgroups / limits / oversubscription.
- Network paths became less predictable (overlay, MTU issues, cross-zone routing).
- Failover / restart behavior wasn’t tested under real load.
Storage: the durability and jitter problem
PostgreSQL is very sensitive to storage behavior because it relies heavily on fsync semantics, WAL throughput, and predictable latency for sync writes. On bare metal or a well-provisioned VM, you can often get very stable performance by:
- Using fast SSD/NVMe
- Separating WAL and data volumes when appropriate
- Benchmarking with fio and Postgres tools (pg_test_fsync) before you commit to architecture
On Kubernetes, you can do this too, but you must be intentional:
- Prefer storage classes built for sustained IOPS and latency stability (not just “it supports PVCs”)
- Validate snapshot/restore behavior end-to-end (because snapshots that exist but can’t restore correctly are theatre)
- Consider dedicated node pools and careful volume placement if you’re chasing low jitter
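
Whichever platform you choose, sync-write jitter is something you can measure rather than guess at. The sketch below is a crude fsync latency probe in Python, offered as an illustration only and not as a replacement for fio or pg_test_fsync. The function names are hypothetical, and the 8192-byte block size is chosen here to match PostgreSQL's default page size.

```python
import os
import tempfile
import time


def fsync_latencies(directory, writes=50, block_size=8192):
    """Time write+fsync pairs in the given directory; returns latencies in milliseconds."""
    payload = b"\0" * block_size
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(writes):
            start = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)  # force the write down to stable storage
            latencies.append((time.perf_counter() - start) * 1000.0)
    finally:
        os.close(fd)
        os.remove(path)
    return latencies


def summarize(latencies):
    """Report median, approximate p99, and worst-case latency."""
    ordered = sorted(latencies)
    p99_index = min(len(ordered) - 1, int(len(ordered) * 0.99))
    return {
        "median_ms": ordered[len(ordered) // 2],
        "p99_ms": ordered[p99_index],
        "max_ms": ordered[-1],
    }


if __name__ == "__main__":
    # Point this at the volume that would hold WAL, not at a scratch directory.
    print(summarize(fsync_latencies(tempfile.gettempdir())))
```

The number to watch is the gap between the median and the tail: a volume with a good median and a poor p99 is the kind that looks fine in a benchmark summary and still produces latency spikes under load.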
Network: the “multi-region makes everything harder” lesson
Replication lag is a good example of why network matters more than platform ideology. In one benchmark study [1] (single-region vs multi-region), average replication lag in single-region was around a few milliseconds, while multi-region averaged tens of milliseconds with occasional spikes under load. The big takeaway: geography and network dominate lag behavior far more than whether you run inside a pod or on a VM.
So if your decision is driven by “we want multi-region active-active,” focus on replication architecture and network reality first. Kubernetes won’t save you from physics.
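
Lag can be tracked the same way on either platform. The study cited above reports lag as time; a complementary measure, assumed here for illustration, is byte lag: the distance between the primary's current WAL position and a replica's replay position. PostgreSQL prints these positions as `pg_lsn` values such as `16/B374D848`, two hexadecimal numbers holding the high and low 32 bits. The helper names below are hypothetical, and the LSN strings would come from queries such as `pg_current_wal_lsn()` on the primary and `replay_lsn` in `pg_stat_replication`.

```python
def lsn_to_int(lsn):
    """Convert a pg_lsn string like '16/B374D848' into an absolute byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def byte_lag(primary_lsn, replica_lsn):
    """Bytes of WAL the replica still has to replay (never negative)."""
    return max(0, lsn_to_int(primary_lsn) - lsn_to_int(replica_lsn))


if __name__ == "__main__":
    print(byte_lag("0/3000060", "0/3000000"), "bytes behind")
```

Byte lag and time lag answer different questions: the first tells you how much data is at risk, the second how stale a replica read might be. Both are worth graphing per region.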
Reliability and HA: Kubernetes gives you rescheduling, not correctness
A controversial statement that’s still true:
Kubernetes gives you rescheduling. PostgreSQL needs correctness.
If a Postgres pod dies, Kubernetes will restart it. Great. But high availability for PostgreSQL is about:
- avoiding split brain
- promoting the right node at the right time
- fencing the old primary
- ensuring replicas are consistent
- ensuring client traffic shifts cleanly
- ensuring backups and restore paths are proven
Kubernetes can help you automate that with mature operators. VMs can help you automate it with mature HA tooling (Patroni/repmgr + a DCS + load balancers, etc.). In both cases, correctness comes from your HA design, your fencing strategy, and your tests, not from the platform’s marketing.
When Kubernetes is a strong fit for PostgreSQL
Kubernetes becomes a very rational choice when:
1. You already run a mature Kubernetes platform
- You have stable storage classes
- You have strong observability
- You have SREs who understand scheduling, disruption, and capacity planning
2. You want an internal “Postgres-as-a-service” model
- Developers request databases via a ticket/API and get guardrails by default
- Standardized backups, monitoring, parameter baselines, and security policies
3. You need many isolated Postgres clusters
- Multi-tenant environments where per-tenant isolation is valuable
- Frequent creation/destruction of clusters (CI, preview environments, ephemeral staging)
4. Your org operates with GitOps discipline
- Declarative config changes
- Reviewable diffs
- Automated drift detection
In these cases, the platform benefits can outweigh the complexity, because you’re actually using the platform benefits.
When VMs are a stronger fit
VMs tend to be the better choice when:
1. Your Postgres cluster is “crown jewel” infrastructure
- Latency-sensitive OLTP
- Predictable I/O behavior matters more than provisioning speed
2. You don’t have Kubernetes specialists on-call
- The fastest path to reliability is fewer moving parts, not more automation
3. You’re running a small number of large databases
- Dedicated instances, tuned for workload
- Scaling is mostly vertical and carefully planned
4. You need tight control over kernel + host settings
- NUMA behavior, huge pages, I/O scheduling, direct-attached NVMe, etc.
If you’re in this world, “boring infrastructure” is a feature.
Two reference architectures you can copy
Option A: Kubernetes with an operator (platform-oriented)
Key design choices:
- Use a mature Postgres operator for day-2 operations (backups, failover, upgrades)
- Use dedicated node pools for Postgres
- Use pod anti-affinity so replicas land on different nodes
- Use PodDisruptionBudgets so maintenance doesn’t take you down
- Keep backups off-cluster (object storage) and run restore drills
And your operator-managed cluster spec should include:
- explicit resource requests
- storage class selection
- monitoring enablement
- backup configuration
- replication settings
Option B: VMs with Patroni (database-runbook oriented)
Key design choices:
- 3-node cluster (1 primary, 2 replicas)
- Patroni for HA with a DCS (etcd/Consul)
- HAProxy for routing writes to primary and reads to replicas (optional)
- PgBouncer for connection pooling
- pgBackRest (or similar) for backups and PITR
- Monitoring stack: node metrics + Postgres metrics + log analysis
This model is widely understood, auditable, and tends to fail in more predictable ways.
Common gotchas (the ones that create 2am incidents)
Kubernetes gotchas
1. CPU limits causing throttling
You can meet “CPU request” but still get throttled under burst if limits are too tight.
2. Pod evictions during load
Especially if PDBs, priorities, and eviction policies aren’t designed for stateful workloads.
3. Storage that looks fast on paper but has latency spikes
Sustained performance is what matters, not peak IOPS marketing.
4. Backups that exist but restores that fail
Test restores on a schedule as a drill, not during an incident.
5. Operator upgrades as a hidden dependency
Your database lifecycle now depends on the operator lifecycle.
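
The first gotcha is easy to check for. On cgroup v2 systems the kernel exposes throttling counters in a `cpu.stat` file, including `nr_periods` and `nr_throttled`. The sketch below parses that text and computes the share of enforcement periods in which the container was throttled. The function names are hypothetical, and the file location mentioned in the comment is an assumption that depends on how your nodes and container runtime are set up.

```python
def parse_cpu_stat(text):
    """Parse 'key value' lines from a cgroup cpu.stat file into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def throttle_ratio(stats):
    """Fraction of CPU enforcement periods in which the cgroup was throttled."""
    periods = stats.get("nr_periods", 0)
    if periods == 0:
        return 0.0
    return stats.get("nr_throttled", 0) / periods


if __name__ == "__main__":
    # Inside a container this text is often readable at /sys/fs/cgroup/cpu.stat
    # (an assumption; the path varies by cgroup version and runtime).
    sample = "usage_usec 900000\nnr_periods 100\nnr_throttled 25\nthrottled_usec 5000\n"
    print(f"throttled in {throttle_ratio(parse_cpu_stat(sample)):.0%} of periods")
```

A ratio that climbs during write-heavy periods is a strong hint that the CPU limit, not PostgreSQL, is behind the latency spike.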
VM gotchas
1. Unvalidated failover
You “have HA” but haven’t practiced it under load with real application behavior.
2. Backup confidence without restore drills
The only backup that matters is the one you restored successfully.
3. Configuration drift
Two replicas that aren’t actually identical are a slow-motion outage.
4. Noisy neighbor on shared hypervisors
“It’s on a VM” doesn’t mean you own the underlying contention story.
5. OS patching and reboots without a runbook
Routine maintenance becomes risky without clear procedures.
The punchline: choose the platform that matches your org’s operating model
My take is simple:
- Kubernetes is excellent when you’re building a database platform.
- VMs are excellent when you’re running a database.
Both can be production-grade. Both can be disasters. The difference is whether your organization is set up to operate the platform you choose.
If you want one practical recommendation that avoids regret, this is it:
Run dev/test Postgres on Kubernetes if it helps delivery speed. Run production Postgres where you can guarantee predictable storage, clear failure modes, and strong operational ownership. That might be Kubernetes, or it might not.
Related
[1] Benchmark Study on Replication Lag in PostgreSQL using Single Region and Multi-Region Architectures
[2] Is Your PostgreSQL Deployment Production Grade?
[4] Clustering in PostgreSQL: Because One Database Server is Never Enough (and neither is two)
[5] Database in Kubernetes: Is that a good idea?
[6] Databases on K8s — Really? [Part 1] [Part 2] [Part 3] [Part 4]
The post PostgreSQL on Kubernetes vs VMs: A Technical Decision Guide appeared first on Stormatics.
Umair Shahid: PostgreSQL, MongoDB, and what “cannot scale” really means
Last week, I read The Register’s coverage of MongoDB CEO Chirantan “CJ” Desai telling analysts that a “super-high growth AI company … switched from PostgreSQL to MongoDB because PostgreSQL could not just scale.” (The Register)
I believe you can show the value of your own technology without tearing down another. That is really what this post is about.
I run a company that lives inside PostgreSQL production issues every day. We deal with performance, HA and DR design, cost optimisation, and the occasional “we are down, please help” call from a fintech or SaaS platform. My perspective comes from that hands-on work, not from a quarterly earnings script.
Scaling stories are always more complex than a soundbite
When a CEO says, “This AI customer could not scale with PostgreSQL,” a lot of details disappear:
- What did the schema and access patterns look like?
- How was the application managing connections and transactions?
- What was running underneath? Cloud managed PostgreSQL, a home-rolled cluster, or something else?
- Were vertical and horizontal scaling options fully explored?
- Did the team have PostgreSQL specialists involved, or was everything on default settings?
None of that reduces MongoDB’s strengths. Document databases are excellent for certain workloads: highly variable document shapes, rapid iteration, and development teams that think in JSON first. MongoDB has earned its place.
My concern is with the narrative that PostgreSQL, as a technology, “cannot scale,” especially given that it is the most popular database among professional developers and has overtaken MongoDB in the DB-Engines ranking over the last decade. (DB-Engines)
The method for calculating this ranking is detailed here: https://db-engines.com/en/ranking_definition.
Every database has strengths. Every database has trade-offs. Real-world results come from matching those strengths to the workload and from running the system with care.
What PostgreSQL offers for scaling
PostgreSQL scaling is a spectrum. When someone says “it does not scale,” the real question is usually “which dimension of scale, under which design?”
1. Vertical scale: one node, serious throughput
A single well-configured PostgreSQL instance on modern hardware handles:
- Hundreds of thousands of transactions per second
- Tens of terabytes of data on a single node
This is not marketing copy; PostgreSQL consultants have demonstrated these ranges repeatedly in the field.
In practice, many SaaS and fintech workloads have lived entirely within this vertical envelope for years. A typical high-end OLTP node may look like:
# Excerpt from postgresql.conf on a busy OLTP system
shared_buffers = '32GB'
effective_cache_size = '96GB'
work_mem = '64MB'
maintenance_work_mem = '4GB'
max_connections = 500
wal_level = replica
max_wal_senders = 20
synchronous_commit = on
Layer that on top of fast NVMe storage, tuned Linux parameters (huge pages, I/O schedulers, TCP settings), and connection pooling (PgBouncer or Pgpool-II), and you already have a very capable single node.
2. Horizontal scale: scale-out patterns that teams use today
When the single-node envelope is not enough, PostgreSQL offers multiple, well-understood patterns:
Read scaling with replicas
- Native streaming replication provides multiple read replicas.
- Connection routing at the application or proxy layer directs reads to replicas and writes to the primary.
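The routing rule in the second bullet can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a production router: the `Router` class, the connection strings, and the "starts with SELECT" heuristic are all hypothetical names and choices made for this example. Real deployments usually lean on a proxy or driver feature, and also have to account for replica lag and for reads that happen inside write transactions.

```python
from itertools import cycle


class Router:
    """Toy read/write splitter: writes go to the primary, reads rotate over replicas."""

    def __init__(self, primary: str, replicas: list[str]):
        self.primary = primary
        # Fall back to the primary when no replicas are configured.
        self._replicas = cycle(replicas or [primary])

    def target_for(self, sql: str) -> str:
        # Hypothetical heuristic: plain SELECTs are reads, everything else is a write.
        # SELECT ... FOR UPDATE takes row locks, so it has to run on the primary.
        statement = sql.lstrip().upper()
        is_read = statement.startswith("SELECT") and "FOR UPDATE" not in statement
        return next(self._replicas) if is_read else self.primary


# Assumed connection strings, for illustration only.
router = Router("host=pg-primary", ["host=pg-replica-1", "host=pg-replica-2"])
print(router.target_for("SELECT * FROM events"))
print(router.target_for("INSERT INTO events (tenant_id) VALUES (1)"))
```

The heuristic is deliberately conservative: anything it cannot classify as a read goes to the primary, which costs some replica offload but never sends a write to a read-only node.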
Partitioning and sharding
Start with native partitioning:
CREATE TABLE events (
    tenant_id  bigint,
    created_at timestamptz,
    payload    jsonb
) PARTITION BY RANGE (created_at);
CREATE TABLE events_2025_q4
    PARTITION OF events
    FOR VALUES FROM ('2025-10-01') TO ('2026-01-01');
This reduces index and table bloat, keeps hot data in smaller structures, and improves cache efficiency. From there, you can shard by tenant or region using extensions such as Citus, which turn PostgreSQL into a distributed cluster while keeping SQL semantics.
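Range partitions have to exist before rows arrive for them, so teams typically generate the DDL ahead of time. The sketch below builds a statement in the same shape as the quarterly partition shown above. The helper name `quarter_partition_ddl` is an assumption for this example, and it only formats text: it does not connect to a database, and the parent name is interpolated directly, so it should only ever receive trusted identifiers.

```python
from datetime import date


def quarter_partition_ddl(parent: str, year: int, quarter: int) -> str:
    """Return CREATE TABLE DDL for one quarterly range partition of `parent`."""
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be between 1 and 4")
    start = date(year, 3 * quarter - 2, 1)
    # The upper bound of a range partition is exclusive,
    # so it is the first day of the following quarter.
    end = date(year + 1, 1, 1) if quarter == 4 else date(year, 3 * quarter + 1, 1)
    return (
        f"CREATE TABLE {parent}_{year}_q{quarter}\n"
        f"    PARTITION OF {parent}\n"
        f"    FOR VALUES FROM ('{start.isoformat()}') TO ('{end.isoformat()}');"
    )


print(quarter_partition_ddl("events", 2025, 4))
```

Because the upper bound is exclusive, consecutive quarters share a boundary date without overlapping, which is why the Q4 partition ends at the first day of the next year.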
Specialised distributed PostgreSQL services
The ecosystem now includes fully managed, distributed PostgreSQL or PostgreSQL-compatible systems:
- AWS Aurora
- Google AlloyDB
- Microsoft HorizonDB on Azure
- Third-party systems such as PGD, TimescaleDB, YugabyteDB, CockroachDB
Even The Register article that quoted the “could not scale” remark listed these services as answers to concerns about PostgreSQL scalability. (The Register)
These platforms exist precisely because organisations want PostgreSQL semantics and ecosystem benefits at a very large scale.
None of this is science fiction. It is what many of us run for clients every day.
AI workloads are still workloads
A lot of the rhetoric here anchors on “AI workloads,” as if that label alone demands a completely different class of database.
When you strip away the hype, most AI-heavy platforms combine:
- High-volume event ingestion (traces, actions, telemetry)
- Vector search over embeddings
- Metadata and configuration storage
- Analytical queries over usage and performance
PostgreSQL handles these patterns very well when you apply the right architecture:
- The pgvector extension provides vector types and indexes.
- Time-series and event tables benefit from partitioning and, if appropriate, time-series extensions. (Tiger Data)
- Read replicas and distributed variants cover high-volume read access and multi-region requirements.
- Strong transactional semantics and mature tooling simplify the “critical path” parts of AI platforms, such as billing, entitlements, and configuration.
MongoDB also effectively supports AI workloads when the document model aligns with the system’s needs. The real design work lies in balancing transactional consistency, query patterns, schema evolution, and operational comfort for the team.
“AI workload” is a description of behaviour, not a free pass to declare one database category the winner.
What I see in real PostgreSQL production environments
Because I run a PostgreSQL-only services firm, I tend to see systems when they are under pressure:
- A fintech running a three-node PostgreSQL cluster sustaining high write rates and achieving 99.99% availability with automated failover and tested DR.
- A last-mile delivery platform built on Odoo and PostgreSQL, where careful indexing, parameter tuning, and query refactoring improved throughput by several orders of magnitude without a move to any new database.
- Multi-region architectures where we measured replication behaviour across regions and tuned topology, sync levels, and application retry logic until lag and failure modes stayed inside business SLAs.
In each case, PostgreSQL scaled just fine once the architecture matched the problem:
- Hot paths isolated into lean tables.
- Background jobs separated from request paths.
- Connection limits enforced with pooling.
- Replication configured deliberately instead of left at defaults.
The blocker was rarely PostgreSQL as a core technology. The blocker was design, implementation, or operational discipline.
Where MongoDB is a strong choice
To keep this honest: there are scenarios where MongoDB is a very reasonable first choice, even in systems that already use PostgreSQL:
- Highly polymorphic document payloads with frequent structural changes, where strict relational modelling would slow teams down.
- Use cases dominated by document-level access with minimal cross-document joins.
- Teams with deep MongoDB experience and an existing operational toolchain that they trust.
Choosing MongoDB in those scenarios is smart engineering.
My issue is with narratives that imply PostgreSQL is fundamentally unable to scale, rather than saying, “For this workload and this team, MongoDB was a better fit.”
Our industry benefits greatly when leaders frame their success that way.
Scaling is an engineering discipline, not a brand attribute
I believe responsible comparisons look more like this:
1. Start from the workload
- Request rate, throughput, and latency targets
- Data model, relationships, and query patterns
- Consistency and durability requirements
2. Consider team capabilities
- SQL and relational experience
- Existing operational muscle around PostgreSQL, MongoDB, or both
- Appetite for running distributed systems
3. Match patterns, then products
- Vertical scale, replicas, and partitioning may cover years of growth.
- When you truly outgrow those, consider distributed PostgreSQL services or specialised databases, including MongoDB, that address the specific gap.
4. Benchmark honestly
- Use the same hardware class, realistic schemas, and realistic queries.
- Measure p95 and p99 latencies, plan stability, failure behaviour, and operational effort, not only headline QPS.
This is where meaningful decisions happen, far away from any sentence that says “X cannot scale.”
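To make the p95 and p99 point concrete, here is a minimal sketch that computes tail percentiles from raw latency samples with the nearest-rank method. The `percentile` helper and the sample values are invented for illustration; they are not measurements from any system discussed here, and benchmarking tools may use a different percentile definition.

```python
import math


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of the data at or below it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    # 1-based rank; multiply before dividing to avoid float noise for whole-number inputs.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds: mostly fast, with a slow tail.
latencies_ms = [5.0] * 94 + [40.0] * 4 + [250.0, 900.0]
mean_ms = sum(latencies_ms) / len(latencies_ms)
print(f"mean={mean_ms:.1f} ms")
print(f"p95={percentile(latencies_ms, 95)} ms, p99={percentile(latencies_ms, 99)} ms")
```

In a sample shaped like this one, the mean stays close to the fast requests while the tail percentiles expose the slow ones, which is the reason to report them alongside headline QPS.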
For CTOs and Database Engineers reading this
If you are responsible for a PostgreSQL deployment and you are now second-guessing your choices because of a quote from an earnings call, you are exactly who I am writing for.
A few practical questions you can ask yourself before considering a move from PostgreSQL to anything else:
- Have we exhausted vertical scaling on a well-tuned primary with fast storage?
- Are we using read replicas where they make sense?
- Is our schema designed for our access patterns, or is the database carrying application-layer problems?
- Have we explored partitioning for our largest, hottest tables?
- Are we testing and measuring, or reacting to anecdotes?
If the answer to several of these is “no,” you still have a lot of PostgreSQL headroom.
My closing thought
You can show MongoDB’s value without framing PostgreSQL as a dead end. You can show PostgreSQL’s value without dismissing MongoDB.
As practitioners, we owe it to the people who depend on these systems to keep the conversation grounded in architecture, workload patterns, and operational reality, rather than headlines.
The post PostgreSQL, MongoDB, and what “cannot scale” really means appeared first on Stormatics.
Laurenz Albe: The bastard DBA from hell

© Laurenz Albe 2025
While I sometimes write about real-life problems I encounter in consulting sessions (see this example), I admit that curiosity and the desire to play are the motivation for many of my articles (like this one). So I am glad that a friend of mine, whose job as a DBA puts him on the front line every day, offered to let me quote from his (unreleased) memoirs. Finally, this is material that I hope will connect with all the hard-working souls out there! My friend insists on anonymity, hence the “bastard DBA from hell” in the title.
The bastard DBA from hell fights the deadlocks
I like being a DBA. The only problem is the people using the databases.
A week ago I get a call from accounting. The guy sounds like I stole his sandwich, “We get deadlocks all the time.”
“Well, don't do it then,” I tell him. I should have known that this kind of person wouldn't listen to good advice.
“It's your Postgre system that gives me these deadlock errors. So you fix them.”
“As you wish.” >clickety click< “Ok, you shouldn't get any more deadlock errors now.”
He mumbles something and hangs up. It didn't sound like “thank you”. That's fine with me. After all, I set his lock_timeout to 500.
The bastard DBA from hell fights table bloat
Yesterday the guy from accounting called me again, in that accusing, complaining tone, “Our disk is 95% full.”
“I know, I got an alert from the monitoring system. Actually, I sent you guys an e-mail about it. 10 days ago.”
“Can't you fix that?”
“Do you want me to extend the file system?”
“We have no more data in there than before. So why do we need a bigger disk?”
I tell him “hang on” and log into their system. >clickety click<
“Well, you got a 350GB table there that's taking up all the space. But it's 99% dead data... Did you create that prepared transaction a month ago and never commit it?” >clickety click< “There, gone now. Just run a VACUUM (FULL) on the table and you'll be good.”
I thought I was rid of him, but I should have known that was not to be. The same voice, an hour later:
“VACUUM (FULL) says ‘out of disk space’.”
“Again — do you want me to extend the file system?”
I should have known that it was a mistake to be nice to the whiner. He decides to get nasty. Big mistake.
“Do your job and clean up the database.”
“As you wish.” >clickety click< “Ok, now you have got enough free disk space.”
I hang up. There are few things as satisfying as a quick TRUNCATE when things get out of hand!
The post The bastard DBA from hell appeared first on CYBERTEC PostgreSQL | Services & Support.







