
Sm.art.graphicstudio sent me this comparison between the soon-to-be-released Olympus 300mm MFT lens and the 7,500 Euro Canon 400mm lens.
Thanks!

As you may have noted from my previous blog post, the last few months were spent getting Postgres-XL up to date with the latest 9.5 release of PostgreSQL. Once we had a reasonably stable version of Postgres-XL 9.5, we shifted our attention to measuring the performance of this brand new version of Postgres-XL. Our choice of benchmark was largely influenced by the ongoing work on the AXLE project, funded by the European Union under grant agreement 318633. Since we are using TPC BENCHMARK™ H to measure the performance of all other work done under this project, we decided to use the same benchmark for evaluating Postgres-XL. It also suits Postgres-XL because TPC-H tries to measure OLAP workloads, something Postgres-XL should do well.
Once the benchmark was decided, another big challenge was to find the right resources for testing. We did not have access to a large cluster of physical machines, so we did what most would do: we decided to use Amazon AWS for setting up the Postgres-XL cluster. AWS offers a wide range of instances, with each instance type offering different compute or IO power.
This page on AWS shows the various available instance types, their resources, and their pricing for different regions. Note that prices and availability may vary from region to region, so it’s important that you check out all regions. Since Postgres-XL requires low latency and high throughput between its components, it’s also important to instantiate all instances in the same region. For our 3TB TPC-H we decided to go for a 16-datanode cluster of i2.xlarge AWS instances. These instances have 4 vCPUs, 30GB of RAM, and 800GB of SSD each, which is sufficient storage for keeping all the distributed tables, the replicated tables (which take more space as the cluster grows), and the indexes on them, while still leaving enough free space in the temporary tablespace for CREATE INDEX and other queries.
The benchmark contains 22 queries designed to examine large volumes of data, execute queries with a high degree of complexity, and give answers to critical business questions. We would like to note that the complete TPC Benchmark™ H specification deals with a variety of tests such as load, power, and throughput tests. For our testing, we have only run individual queries and not the complete test suite. TPC Benchmark™ H is comprised of a set of business queries designed to exercise system functionalities in a manner representative of complex business analysis applications. These queries have been given a realistic context, portraying the activity of a wholesale supplier to help the reader relate intuitively to the components of the benchmark.
The components of the TPC-H database are defined to consist of eight separate and individual tables (the Base Tables). The relationships between columns of these tables are illustrated in the following diagram.
We analyzed all 22 queries in the benchmark and came up with the following data distribution strategy for various tables in the benchmark.
| Table Name | Distribution Strategy |
| --- | --- |
| LINEITEM | HASH (l_orderkey) |
| ORDERS | HASH (o_orderkey) |
| PART | HASH (p_partkey) |
| PARTSUPP | HASH (ps_partkey) |
| CUSTOMER | REPLICATED |
| SUPPLIER | REPLICATED |
| NATION | REPLICATED |
| REGION | REPLICATED |
Note that LINEITEM and ORDERS, which are the largest tables in the benchmark, are often joined on the ORDERKEY, so it makes a lot of sense to collocate these tables on the ORDERKEY. Similarly, PART and PARTSUPP are frequently joined on PARTKEY and hence they are collocated on the PARTKEY column. The rest of the tables are replicated to ensure that they can be joined locally when needed.
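To make the collocation argument concrete, here is a minimal Python sketch of hash distribution on a shared key. It is only an illustration: `node_for` uses a plain modulo as a stand-in for whatever hash function the cluster applies internally, and the toy `orders` and `lineitems` rows are hypothetical. The point it shows is that when both tables are distributed on the order key, every node can join its own rows without fetching anything from another node.

```python
NUM_DATANODES = 16  # matches the 16-datanode cluster described above


def node_for(key: int) -> int:
    """Stand-in for the cluster's hash function (assumption: plain modulo)."""
    return key % NUM_DATANODES


# Hypothetical toy data: (o_orderkey,) and (l_orderkey, l_linenumber).
orders = [(k,) for k in range(1, 101)]
lineitems = [(k, n) for k in range(1, 101) for n in range(1, 4)]


def distribute(rows, key_index=0):
    """Place each row on the node chosen by hashing its distribution key."""
    nodes = {n: [] for n in range(NUM_DATANODES)}
    for row in rows:
        nodes[node_for(row[key_index])].append(row)
    return nodes


orders_by_node = distribute(orders)
lineitems_by_node = distribute(lineitems)


def local_join(node):
    """Join LINEITEM to ORDERS using only the rows stored on one node."""
    order_keys = {o[0] for o in orders_by_node[node]}
    return [li for li in lineitems_by_node[node] if li[0] in order_keys]


# Every lineitem finds its order locally, so no cross-node traffic is needed.
joined = sum(len(local_join(n)) for n in range(NUM_DATANODES))
print(joined == len(lineitems))
```

If the two tables were hashed on different columns, `local_join` would miss rows and the join would have to ship tuples between nodes, which is the case the replicated tables are meant to avoid.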
We compared results obtained by running a 3TB TPC-H Load Test on PostgreSQL 9.6 against the 16-node Postgres-XL cluster. The following charts demonstrate the performance characteristics of Postgres-XL.
The above chart shows the time taken to complete various phases of a Load Test with PostgreSQL and Postgres-XL. As seen, Postgres-XL performs slightly better for COPY and does a lot better for all other cases. Note: We observed that the coordinator requires a lot of compute power during the COPY phase, especially when more than one COPY stream is running concurrently. To address that, the coordinator was run on a compute optimised AWS instance with 16 vCPUs. Alternatively, we could have run multiple coordinators and distributed the compute load between them.
We also compared the query run times for the 3TB benchmark on PostgreSQL 9.6 and Postgres-XL 9.5. The following chart shows the performance characteristics of query execution on the two setups.
We observed that on average queries ran about 6.4 times faster on Postgres-XL, and at least 25% of the queries showed almost linear improvement in performance; in other words, they performed nearly 16 times faster on this 16-node Postgres-XL cluster. Furthermore, at least 50% of the queries showed a 10 times improvement in performance.
We further analysed the query performance and concluded that queries that are well partitioned across all available datanodes, with minimal exchange of data between nodes and without repeated remote execution calls, scale very well in Postgres-XL. Such queries typically have a Remote Subquery Scan node at the top, and the subtree under that node is executed on one or more nodes in parallel. It’s also common to have some other nodes, such as a Limit node or an Aggregate node, on top of the Remote Subquery Scan node. Even such queries perform very well on Postgres-XL. Query Q1 is an example of a query that should scale very well with Postgres-XL.
On the other hand, queries that require a lot of tuple exchange between datanodes and/or between the coordinator and datanodes may not do well in Postgres-XL. Similarly, queries that require many cross-node connections may also show poor performance. For example, you will notice that the performance of Q22 is bad compared to a single-node PostgreSQL server. When we analysed the query plan for Q22, we observed that there are three levels of nested Remote Subquery Scan nodes in the query plan, where each node opens an equal number of connections to the datanodes. Further, the Nested Loop Anti Join has an inner relation with a top-level Remote Subquery Scan node, and hence for every tuple of the outer relation it has to execute a remote subquery. This results in poor query execution performance.
While benchmarking Postgres-XL we learnt a few lessons about using AWS. We think they will be useful for anyone who is looking to use or test Postgres-XL on AWS.
We are able to show, through various benchmarks, that Postgres-XL can scale really well for a large set of real-world, complex queries. These benchmarks help us demonstrate Postgres-XL’s capability as an effective solution for OLAP workloads. Our experiments also show that there are some performance issues with Postgres-XL, especially for very large clusters and when the planner makes a bad choice of plan. We also observed that when there is a very large number of concurrent connections to a datanode, performance worsens. We will continue to work on these performance problems. We would also like to test Postgres-XL’s capability as an OLTP solution by using appropriate workloads.
Everyone can join the new Olympus Global Photo contest (Click here). The grand prize is the “Latest Olympus PEN Camera”. And we now know this will be the new PEN with integrated viewfinder 
One more thing: Read the Photo Tips by Peter Baumgarten on GetOlympus (Click here).
The FSF has been warning users of the dangers of the Trans-Pacific Partnership (TPP) for many years now. The TPP is an agreement negotiated in secret, nominally for the promotion of trade, yet entire chapters of it are dedicated to implementing restrictions and regulations on computing and the Internet. In April of 2015, a leaked draft of the agreement revealed a whole host of problems. With extensions to the term of copyright, confusing provisions on software patents, and the spread of the worst aspects of the Digital Millennium Copyright Act's (DMCA) Digital Restrictions Management (DRM) provisions beyond the United States, the TPP negotiations were and are an attack on user freedom. In the U.S. at that time, the battle was to stop Trade Promotion Authority, which would fast-track passage of TPP in the U.S. once an accord was reached. We unfortunately lost that battle, and last month the TPP negotiations ended. On November 5th, the secret text of TPP was finally officially released to the public. Because of Trade Promotion Authority, the time we have left to stop TPP in the U.S. is extremely limited. For U.S. residents, there are only 90 days left before this trade agreement locks users in for possibly decades. For users in other TPP member countries, the time frame is not much better. The war rages on and the time to act is now.
One big reveal from the final publication was the addition of the Electronic Commerce chapter, which was not previously leaked. The chapter contains provisions similar to those found in the Trade in Services Agreement (TISA) that we wrote about previously. TPP requires that "No Party shall require the transfer of, or access to, source code of software owned by a person of another Party, as a condition for the import, distribution, sale or use of such software, or of products containing such software, in its territory." While government procurement is exempted from this rule under TPP ("This Chapter shall not apply to ... government procurement"), it would still mean that member countries could not pass a law requiring that imported consumer devices come with source code. The regulation would not affect freely licensed software, such as software under the GPL, that already comes with its own conditions ensuring users receive source code. Such licenses are grants of permission from the copyright holders on the work, who are not a "Party" to TPP. But even if the rule is limited, it is clearly an attack on the sharing of software and government policies to encourage it. This is yet another reason why we must stop TPP.
Unfortunately, as the similar language found in TISA shows, even if we are successful in stopping TPP, other international trade agreements lie in wait that would extend these problems all around the world as well as produce many of their own. Dozens of countries around the world are ensnaring each other in agreements that threaten users' fundamental liberty. If you live in a country that is not a member of TPP, now is not the time for complacency. The short track to TPP approval may be overshadowing the agreements that are threatening you personally further down the line, but now is the perfect time to shine a light on them as well.
TPP, and the ongoing fight against all international "trade" agreements that threaten freedom, is one of the most urgent issues facing users today. The work we have done over the years against DRM and software patents will be set back if we do not stop these agreements from coming to pass. But we are not alone in this fight. Organizations all around the world are rallying to the cause to stop TPP and agreements like it. The first step is a series of rallies taking place in Washington D.C. from November 14th to the 18th. The Electronic Frontier Foundation will be hosting two days of action.
We hope that you can join our friends at EFF on November 16th in Washington D.C., and on November 17th in Washington D.C. and around the world. This multi-day rally is just the start of the steps we need to take to stop TPP. Here's what you need to do:
This post was originally authored for the UTCS Facebook group, in the context of discussing the role of formalism in software "engineering" as regards this article from Business Insider arguing that "The title “engineer” is cheapened by the tech industry."
Another article in the same vein.
The argument at hand is that "Software Engineering" isn't "real engineering" because we play it fast and loose and don't use formalisms to the extent that other fields do. So let's consider formalisms in the context of computer science.
They're pretty great. We can prove that functions are, with respect to the proof criteria, done and correct for all time. Sorting algorithms, Boyer-Moore, data structures and so forth do well defined things correctly and fall in this category.
On the other hand, most of the work we've done in the direction of formalisms is, to be polite, ultimately useless. We can't do general proof, never mind automate general proof (by application of the halting problem). Even without the halting problem we have logic incompleteness. So really I think that saying we have formalisms in general software engineering is pretty silly. The classical "formal" tools we have suffer from immense limitations, and while some heroes may succeed with them anyway, they are far from the realms of mortals.
So what of the claim that we do "software engineering"? If you dig very far into software engineering literature, you'll find that it's kinda a sham. To support this point I'll direct you to Parnas, among other authors such as Brooks, who was previously mentioned. The fundamental problem in software engineering isn't designing or writing software. That's easy; we teach undergrads and nontechnical people to program all the time. The real problem we face is coping with the long term consequences of change in a software system.
Users don't really understand what it is that they want most of the time. If you look at the development of large software systems, you'll find a huge amount of learning occurs as you build something. As you write a program, you are forced to explore its problem domain. In doing so, you will unavoidably learn more about the problem and discover things you should have designed differently. Worse, users seeing intermediary results will get a better understanding of what it is that they want and start asking for it. So we are fated to lose from both sides. The only way to win is to reduce your iteration time, both in terms of cleaning up your mess and getting something in front of your users faster. This is not somehow a problem unique to software development; it is shared by all engineering. The real draw of modern automated manufacturing is just this: that design improvements can be rolled out faster.
What is unique to software is that when we "deliver" I think it's silly to say that whatever we've built is "done". When we say that we're "done" with a ship and the last rivet is fitted, the cost of changing that ship is enormous. Similarly when the last brick of a bridge is laid, the cost to tear it up and start over again is prohibitive. In contrast, we work with programmable machines which practically speaking never decay and whose behavior is anything but fixed at the time of "manufacture".
Because of the huge change costs, vast amounts of time in traditional engineering are spent gathering and evaluating design criteria, because they must be. The price of getting it wrong is simply too high. Unless you're a Defense contractor you can't afford to deliver the wrong thing late and over budget (ayyyy). Instead you have to figure out what the right thing is and how to build it right the first time, on time and under budget if possible. Case in point: during the development of the digital telephone, Bell Labs spent a huge amount of money doing user studies. They examined different dialing keypad layouts and discovered the layout we use today. They tested human hearing to understand how much (read how little) information was required to support acceptable calls. And on and on, mostly done ahead of time due to the huge investments involved.
We have the opposite problem. It costs us next to nothing to change our design as our design criteria evolve. This means that we can afford to play fast and loose. We can always patch it later. Look at Tesla, who are famously OTA updating their fleet of cars to, of all things, add features that weren't present when their customers purchased the cars. It's literally more time (read cost) efficient for us to bang out a release and get it in front of users to try and refine our conception of what they want than it is for us to gather requirements and do an in depth design.
An interesting case study is Chris Granger's programming language Eve, which was "designed" (maybe discovered is the better word) both by doing a ton of research on prior art and by running user study after user study to try and understand how users were learning and making use of Eve. Critically, changes were made not just to the UI but to the semantics of the language as exposed to users, to try and smooth out the pain points while preserving utility. This stands in stark contrast to languages like Pascal, which came more or less fully formed into the world and have changed little in response to requirements or user studies.
Similarly modern "continuous deployment" and "continuous testing" infrastructure is laudable because it helps programmers get software out the door faster. This tighter feedback loop between users and programmers helps us refine the "right thing" and deliver that thing. Faster is better. Sound familiar?
Move fast and break things
This is where we need to come back to formalism. Because this really isn't working out for Facebook and it won't work out for you. While I will yet write an article on breaking changes and the necessity of really breaking software, the majority of breaking changes in the above sense are bugs and are accidental rather than essential. We are, as the other engineers accuse us, really bad at doing software development. We need a way to cope with change and ensure stability at the same time. If we choose our abstractions well through experience and art, we can get some of this. But there are very few people with the art to make design decisions which will mitigate real unknown future change. What we really need are tools which help us control and understand the impacts of our changes.
One of these tools is obviously testing and code coverage. We can understand the impact of our changes or our failure to change in terms of what tests we've broken or failed to fix. We can also understand changes in terms of their impact on the coverage of our programs. If we add a bunch of new code that isn't tested, then our tests explore less of the behavior of our programs and are less valuable, and the indicator of test coverage can tell us this. If we delete a bunch of code that wasn't tested or wasn't used, then we've reduced the state space of our program (which is a great thing!) and our coverage should go up to reflect this.
These are all really powerful tools, but they are more the practices and methodologies which have gotten us this bad name of not being real engineering. What of proof as a tool?
Our medium being inherently different and the cost mechanics also being different, I think that any bemoaning of software correctness going out the window is plainly silly. Of course software won't be "correct". Much software is too complex for any meaningful "correctness" criterion to be established. It's far more efficient to build something that sorta works and iterate than to obsess over getting it right. Proof: small software shops that can iterate quickly are eating the big slow movers' lunches, if you care to look at the last few decades of our industry's history.
This suggests that formalism needs to take a different place in software engineering. It needs to exist not as an engine for absolute correctness, but as a tool to be used to help enable iteration. Type directed programming and error prediction in Haskell is, I think, a great and under explored use of this. If you want to make a change, make that change and run the typechecker (a proof engine with regards to whatever properties you encoded in the type system), and all the errors you get are the places where your existing program needs to be changed in order to accommodate the desired change. This is not general correctness, this is not halting, this is just a tool for letting you as an engineer track the criteria which make what you're building good enough and whether they are preserved.
This conception of lightweight formalism(s) is I think the future of the industry. Real theorem proving has its place in hardware design and in other corners of the industry where the cost benefit tradeoffs are again different, but it's simply too effort intensive to be practicable for the "average" programmer working on yet another Uber for X.
So. Parnas. Fake it. Ship it. Think about it. Fake it again and again. Use formalisms to help you fake it. But formalism for any classical value of formalism is likely nonsense unless you have good reason otherwise.
What are we doing here by all this pushing around of constraints?
I think this is engineering.
^d
On 29 October 2015, the European Parliament adopted a report (2015/2635(RSP)), which condemned mass surveillance throughout Europe. While focusing primarily on legal precedents of data protection, Parliament proposed new recommendations to improve IT security by migrating to free software, as well as adding free software as a mandatory selection criterion in public IT procurement.
Specifically, Section 47 of the report:
Welcomes the steps taken so far to strengthen Parliament’s IT security, as outlined in the action plan on EP ICT Security prepared by DG ITEC; asks for these efforts to be continued and the recommendations made in the resolution fully and swiftly carried out; calls for fresh thinking and, if necessary, legislative change in the field of procurement to enhance the IT security of the EU institutions; calls for the systematic replacement of proprietary software by auditable and verifiable open-source software in all the EU institutions, for the introduction of a mandatory ’open-source’ selection criterion in all future ICT procurement procedures, and for efficient availability of encryption tools.
This is a welcome change in rhetoric from the previous version of the resolution (2013/2188(INI) Section 91). The new version uses stronger language to push for migrating to Free Software as a way to improve transparency and security in EU IT systems. Although resolutions passed in the European Parliament are non-binding, passage of this report sends a strong message to the European Commission to include provisions for Free Software in future legislation.
Developers are having a lot of fun creating new products for the Olympus AIR MFT camera! Here are two new toys:
1) Make.dmm is selling a “gun shutter” device for your Olympus AIR (see image on top).
2) And on DC.watch you can find a more “peaceful” shutter version 
Olympus AIR MFT camera is in Stock at Amazon US (Click here), Adorama (Click here), BHphoto (Click here) and on eBay in White (Click here) or Black (Click here).
–
Gun link via Photorumors.
New features in this version:
Object Inspector
  Full details of current object nicely formatted
  Includes all attachments to the current object
  Tracks the cursor
  Available from the Print View
Object Editor
  Edit built in attributes
  Edit attached features
  Available from Inspector or by right click.
  Create Palette Buttons for instantiating the current attached features
Score and Movement Properties Editor
  Edit global properties of the Score
  Edit global properties of the Movement
  Switch between movements to edit properties
  Create Palette Buttons for instantiating the current score or movement property
Staff and Voice Properties Editor
  Edit global properties of the Staff
  Edit global properties of the Voice
  Create Palette Buttons for instantiating the current staff or voice property
Search and Replace
  Search for rhythmic patterns
  Edit at found pattern and continue
  Search for note sequences
  Wrap to staff start/next staff
  Resume search
Score Layout Editor
  Duplicate Movements with separate edits
  Re-order Movements
  Duplicate Staffs with separate edits
  Staffs can be added from any movement into any other
Titles
  Control of bold, italic and fontsize
  Control of spacing
  Comprehensive set of fields to set
Beaming Rules
  Create scorewide beaming rules
  Rules for multiple time signatures
  Regular rules with exceptions done by example.
Preview of Text/Music/Fret Diagram/Chord Symbol Markup
  Check the appearance before re-typesetting the score
Notehead Styles
  Complete set of notehead styles
  Set on individual notes or score-wide
Vertical Spacing Controls
  Complete set of vertical spacing distance settings
    Titles to System
    System to System
    … more
Improved MIDI output
  Sustain Pedal effect
Bug Fixes
  MIDI message lengths corrected for custom messages
User Manual
  Many new sections added
Known issues for this release:
  option -a is ignored unless -n is given.
  The ornaments with accidentals above/below command is broken.
Binary packages can be downloaded via the denemo.org website.
Here are the compressed sources (from a mirror) :
http://ftpmirror.gnu.org/denemo/denemo-2.0.0.tar.gz
If automatic redirection fails, the list of mirrors is at:
http://www.gnu.org/order/ftp.html
Or if need be you can use the main GNU ftp server:
You can’t build a real-life system without caching.
That being said, it’s often the case that parts of the system you think are going to be slow aren’t. I’ve noticed a tendency to build out a huge stack of components (“we’ll have PostgreSQL, and Redis, and Celery, and Varnish, and…”) without actually measuring where the bottlenecks are.
Example: A counter.
Suppose you need a global counter for something. It needs to be super-fast, and available across all of the web front ends. It’s not transactional (you never “uncount” based on a rollback), but you do want it to be persistent.
Option 1: Drop Redis into the stack, use INCR to keep the counter, and have some other process that reads the counter and spills it into PostgreSQL, then have some other process that picks up the count when Redis starts and initializes it (or be smart enough to read from both places and add them when you need it), and accept that there are windows in which you might lose counts.
Option 2: Use SERIAL in PostgreSQL.
But option 2 is really really really slow compared to super-ultra-fast Redis, right?
Not really (test on an Amazon i2.2xlarge instance, client over local sockets, Python client):
INCRs in Redis: 824 seconds.
SELECT nextval('') in PostgreSQL: 892 seconds.
So: Slower. 6.8 microseconds per increment slower. And no elaborate Redis tooling.
So, build for operation, apply load, then decide what to cache. Human intuition about what might be slow is almost certainly wrong.
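The arithmetic behind those figures can be checked in a few lines of Python. This is only a back-of-the-envelope sketch: the number of increments is not stated above, so the value computed here is inferred from the two totals and the quoted per-increment gap rather than taken from the test itself.

```python
redis_seconds = 824          # total time for the Redis INCR run
postgres_seconds = 892       # total time for the PostgreSQL nextval() run
per_increment_gap = 6.8e-6   # quoted difference per increment, in seconds

# Total extra time PostgreSQL needed over the whole run.
gap_seconds = postgres_seconds - redis_seconds

# Inferred (not stated in the post): how many increments the run must have
# contained for the totals and the per-increment gap to agree.
implied_increments = gap_seconds / per_increment_gap

# Relative slowdown of PostgreSQL versus Redis.
relative_slowdown = gap_seconds / redis_seconds

print(round(implied_increments), f"{relative_slowdown:.1%}")
```

Put differently, the sequence costs roughly 8% more wall-clock time than Redis in this test, in exchange for persistence and one fewer component to operate.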
Postgres has been lacking something for quite a while, and more than a few people have attempted to alleviate the missing functionality multiple times. I’m speaking, of course, about parallel queries. There are several reasons for this, among them the various distribution and sharding needs of large data sets. When tables start to reach hundreds of millions, or even billions of rows, even high cardinality indexes produce results very slowly.
I recently ran across an extension called pmpp for Poor Man’s Parallel Processing and decided to give it a try. It uses Postgres’ dblink system to invoke queries asynchronously and then collates the results locally. This allows further queries on that result set, as if it were a temporary table.
Theoretically this should be ideal for a distributed cluster of shards, so let’s see what happens if we try this with our sensor_log table in that configuration:
CREATE TABLE sensor_log (
  sensor_log_id  SERIAL PRIMARY KEY,
  location       VARCHAR NOT NULL,
  reading        BIGINT NOT NULL,
  reading_date   TIMESTAMP NOT NULL
);

CREATE INDEX idx_sensor_log_location ON sensor_log (location);
CREATE INDEX idx_sensor_log_date ON sensor_log (reading_date);
CREATE INDEX idx_sensor_log_time ON sensor_log ((reading_date::TIME));

CREATE SCHEMA shard_1;
SET search_path TO shard_1;
CREATE TABLE sensor_log (LIKE public.sensor_log INCLUDING ALL)
INHERITS (public.sensor_log);

CREATE SCHEMA shard_2;
SET search_path TO shard_2;
CREATE TABLE sensor_log (LIKE public.sensor_log INCLUDING ALL)
INHERITS (public.sensor_log);

CREATE SCHEMA shard_3;
SET search_path TO shard_3;
CREATE TABLE sensor_log (LIKE public.sensor_log INCLUDING ALL)
INHERITS (public.sensor_log);

CREATE SCHEMA shard_4;
SET search_path TO shard_4;
CREATE TABLE sensor_log (LIKE public.sensor_log INCLUDING ALL)
INHERITS (public.sensor_log);
The top sensor_log table in the public schema exists merely so we can query all the table sets as a whole without using a bunch of UNION statements. This should allow us to simulate how such a query would run without the benefit of parallel execution on each shard.
Now we need to fill the shards with data. Fortunately the generate_series function has an option to increment by arbitrary amounts, so simulating a hash function for distribution is pretty easy. Here’s what that looks like:
INSERT INTO shard_1.sensor_log (location, reading, reading_date)
SELECT s.id % 1000, s.id % 100, now() - (s.id || 's')::INTERVAL
  FROM generate_series(1, 4000000, 4) s(id);

INSERT INTO shard_2.sensor_log (location, reading, reading_date)
SELECT s.id % 1000, s.id % 100, now() - (s.id || 's')::INTERVAL
  FROM generate_series(2, 4000000, 4) s(id);

INSERT INTO shard_3.sensor_log (location, reading, reading_date)
SELECT s.id % 1000, s.id % 100, now() - (s.id || 's')::INTERVAL
  FROM generate_series(3, 4000000, 4) s(id);

INSERT INTO shard_4.sensor_log (location, reading, reading_date)
SELECT s.id % 1000, s.id % 100, now() - (s.id || 's')::INTERVAL
  FROM generate_series(4, 4000000, 4) s(id);
Clearly a real sharding scenario would have a lot more involved in distributing the data. But this is poor-man’s parallelism, so it’s only appropriate to have a bunch of lazy shards to go with it.
In any case, we’re ready to query these tables. The way we generated the data, each table contains a million rows representing about six weeks of entries. A not infrequent use case for this structure is checking various time periods distributed across multiple days. That’s why we created the index on the TIME portion of our reading_date column.
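As a quick sanity check on those numbers, the stepped series can be mirrored with Python ranges. This is only a sketch of the arithmetic, not of anything Postgres does internally; the names `TOTAL_IDS` and `SHARD_COUNT` are made up for the illustration.

```python
TOTAL_IDS = 4_000_000   # upper bound passed to generate_series
SHARD_COUNT = 4         # number of shards, which is also the step size

# range() excludes its stop value, so add 1 to mirror the inclusive
# upper bound of generate_series(start, 4000000, 4).
shard_ids = {
    shard: range(shard, TOTAL_IDS + 1, SHARD_COUNT)
    for shard in range(1, SHARD_COUNT + 1)
}

rows_per_shard = {shard: len(ids) for shard, ids in shard_ids.items()}

# Spot-check that each id lands in exactly one shard.
sample_ok = all(
    sum(i in ids for ids in shard_ids.values()) == 1
    for i in range(1, 10_001)
)

# Each row is stamped now() minus id seconds, so the data spans
# TOTAL_IDS seconds in all.
span_weeks = TOTAL_IDS / (7 * 24 * 60 * 60)

print(rows_per_shard, sample_ok, round(span_weeks, 1))
```

Each shard ends up with exactly a million ids, no id is shared between shards, and four million seconds works out to a little over six and a half weeks.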
If, for example, we wanted to examine how 2PM looked across all of our data, we would do something like this:
\timing ON

SELECT COUNT(*) FROM public.sensor_log
 WHERE reading_date::TIME >= '14:00'
   AND reading_date::TIME < '15:00';

TIME: 1215.589 ms

SELECT COUNT(*) FROM shard_1.sensor_log
 WHERE reading_date::TIME >= '14:00'
   AND reading_date::TIME < '15:00';

TIME: 265.620 ms
The second run with just one partition is included to give some insight into how fast the query could be if all four partitions could be checked at once. Here’s where the pmpp extension comes into play. It lets us send as many queries as we want in parallel, and pulls the results as they complete. Each query can be sent to a different connection, too.
For the sake of simplicity, we’ll just simulate the remote connections with a local loopback to the database where we created all of the shards. In a more advanced scenario, we would be using at least two Postgres instances on potentially separate servers.
Prepare to be amazed!
CREATE EXTENSION postgres_fdw;
CREATE EXTENSION pmpp;
CREATE SERVER loopback
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host 'localhost', dbname 'postgres', port '5433');
CREATE USER MAPPING
FOR postgres
SERVER loopback
OPTIONS (USER 'postgres', password 'test');
\timing ON
CREATE TEMP TABLE tmp_foo (total INT);
SELECT SUM(total) FROM pmpp.distribute( NULL::tmp_foo, 'loopback',
array[
'SELECT count(*) FROM shard_1.sensor_log
WHERE reading_date::TIME >= ''14:00''
AND reading_date::TIME < ''15:00''',
'SELECT count(*) FROM shard_2.sensor_log
WHERE reading_date::TIME >= ''14:00''
AND reading_date::TIME < ''15:00''',
'SELECT count(*) FROM shard_3.sensor_log
WHERE reading_date::TIME >= ''14:00''
AND reading_date::TIME < ''15:00''',
'SELECT count(*) FROM shard_4.sensor_log
WHERE reading_date::TIME >= ''14:00''
AND reading_date::TIME < ''15:00'''
]
);
TIME: 349.503 ms
Nice, eh? With a bit more “wrapping” to hide the ugliness of broadcasting a query to multiple servers, this has some major potential! Since we separated the shards by schema, we could even bootstrap the connections so the same query could be sent to each without any further modifications. Of course, the Postgres foreign data wrapper doesn’t let us set the schema for created servers, so we’d need another workaround like pg_service.conf, but the components are there.
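One small step toward that kind of wrapping is to generate the query array instead of typing it out. What follows is a minimal sketch, not a finished wrapper: it assumes the four shard_N schemas, the loopback server and the tmp_foo temp table from the example above, and it calls pmpp.distribute() with the same three arguments as before. format(), array_agg() and generate_series() are stock Postgres; the shard_id alias is just an illustrative name.

```sql
-- Build one count(*) query per shard, then hand the whole array to pmpp.
-- Assumes shard_1 .. shard_4, the 'loopback' server and tmp_foo exist.
SELECT SUM(total)
  FROM pmpp.distribute(
         NULL::tmp_foo,
         'loopback',
         (SELECT array_agg(
                   format(
                     'SELECT count(*) FROM shard_%s.sensor_log
                       WHERE reading_date::TIME >= %L
                         AND reading_date::TIME < %L',
                     shard_id, '14:00', '15:00'
                   )
                 )
            FROM generate_series(1, 4) AS shard_id)
       );
```

Using %L lets format() handle the quoting of the time literals, which avoids the doubled single quotes in the hand-written version.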
The primary caveat is that this approach works best for queries that drastically reduce the result set using aggregates. The problem is that Postgres needs to run the query on each system, fetch the results into a temporary structure, and then return them again from the distribute() function. This means the more rows come back, the slower the whole thing gets; there’s a lot of overhead involved.
Take a look at what happens when we try to run the aggregate locally:
\timing ON
SELECT COUNT(*) FROM pmpp.distribute( NULL::sensor_log, 'loopback',
array[
'SELECT * FROM shard_1.sensor_log
WHERE reading_date::TIME >= ''14:00''
AND reading_date::TIME < ''15:00''',
'SELECT * FROM shard_2.sensor_log
WHERE reading_date::TIME >= ''14:00''
AND reading_date::TIME < ''15:00''',
'SELECT * FROM shard_3.sensor_log
WHERE reading_date::TIME >= ''14:00''
AND reading_date::TIME < ''15:00''',
'SELECT * FROM shard_4.sensor_log
WHERE reading_date::TIME >= ''14:00''
AND reading_date::TIME < ''15:00'''
]
);
TIME: 4059.308 ms
Basically, do anything possible to reduce the result set on queries. Perform aggregation remotely if possible, and don’t be surprised if pulling back tens of thousands of rows takes a bit longer than expected.
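Remote aggregation still works for aggregates that can't simply be summed, as long as each shard returns the pieces needed to combine them. As a sketch, an average can be rebuilt from per-shard sums and counts. The tmp_partial table and its column names are hypothetical, and this assumes pmpp.distribute() maps result columns onto the row type by position, as it appears to in the count example above.

```sql
-- Hypothetical row type for the partial aggregates from each shard.
CREATE TEMP TABLE tmp_partial (reading_sum NUMERIC, reading_count BIGINT);

-- Each shard returns one row (sum, count); the average is combined locally.
SELECT SUM(reading_sum) / NULLIF(SUM(reading_count), 0) AS avg_reading
  FROM pmpp.distribute( NULL::tmp_partial, 'loopback',
         array[
           'SELECT sum(reading)::NUMERIC, count(*) FROM shard_1.sensor_log
             WHERE reading_date::TIME >= ''14:00''
               AND reading_date::TIME < ''15:00''',
           'SELECT sum(reading)::NUMERIC, count(*) FROM shard_2.sensor_log
             WHERE reading_date::TIME >= ''14:00''
               AND reading_date::TIME < ''15:00''',
           'SELECT sum(reading)::NUMERIC, count(*) FROM shard_3.sensor_log
             WHERE reading_date::TIME >= ''14:00''
               AND reading_date::TIME < ''15:00''',
           'SELECT sum(reading)::NUMERIC, count(*) FROM shard_4.sensor_log
             WHERE reading_date::TIME >= ''14:00''
               AND reading_date::TIME < ''15:00'''
         ]
       );
```

Only four rows cross the wire this way, instead of every matching reading. Averaging the per-shard averages would be simpler but is only correct when every shard contributes the same number of rows.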
Given that caveat, this is a very powerful extension. It’s still very early in its development cycle, and could use some more functionality, but the core is definitely there. Now please excuse me while I contemplate ways to glue it to my shard_manager extension and methods of wrapping it to simplify invocation.
Beyond its style, what impresses most in Paula Fox's Desperate Characters (Desesperados in the Brazilian edition) is the death of civility, and how closely the story resembles what we witness in our daily lives. Following the descriptions of Brooklyn, in the New York of the 1960s (the book was published in 1970), and of characters who feel their freedom and their safety restricted, prompts an unpleasant analogy with, for example, what I experience in São Paulo.
Sophie and Otto, the protagonists, are financially stable and are polite, sensitive people. The couple is marooned in its world of culture and small luxuries, but the siege slowly tightens: values become confused (an employee at Otto's office is called racist for asking a black client to put out his cigarette in the ashtray rather than on the carpet); individual rights cease to be a prerogative of all and turn into trenches against “educated whites”; acts of vandalism recur and violence becomes banal; and communication itself seems impossible, because everyday conversations are full of slang or commonplaces, and also of a shallow leftism, useful for excusing immoral, wholly unjustifiable behavior.
A Marxist critic, with his superficial view polluted by the worst ideology, would say that Sophie and Otto are paying the price of the “proletarianization of society”. The question, however, runs deeper. There is a systematic relativization of values, an evident decay of the spirit. Even some of the couple's friends are contaminated, becoming cynical or superficial.
Not that Sophie and Otto are pure or beyond reproach, but they witness the death of civility, even of that minimal politeness that makes social life possible. The streets are always filthy; the resident of a neighboring tenement urinates out of the window. Sophie's conclusion, as she recalls the period when she was unfaithful to Otto, serves to define what she and her husband perceive more and more: “Ticking away inside the carapace of normal life and its rudimentary agreements was anarchy”.
To repeat what I have already said, the comparisons are inevitable, and residents of Brazil's big cities who still retain a minimum of lucidity will identify with the protagonists. That feeling is reinforced by the author's language, precise and excruciating, as in this passage, in which the narrator describes the waiting room of an emergency ward:
It was like a bus station, like an abandoned yard, like the corridors of the cars of the old B. & O. trains, like subway platforms, like police stations. It combined the transitory quality, the atmosphere of a public terminal, with the immediately apprehended terror of an antechamber to disaster.
It was a dead hole, smelling of synthetic leather and disinfectant, two smells that seemed to emanate from the torn and scratched covering of the benches set against the three walls. It smelled of the tobacco ash that filled the two metal standing ashtrays. On the chrome rim of one of them, a cigar stub gleamed, wet, like a piece of chewed steak. There was a smell of peanut shells and of the waxed papers of caramels thrown on the floor, a smell of old newspaper, dry, ink-stained, suffocating and faintly like a urinal, a smell of sweat from armpits, groins, backs, faces, seeping out and drying in the lifeless air, a smell of clothing (cleaning fluids soaked into fabric and blooming horribly in the warm, sweetish air, which stuck in the nostrils like thorns), all the exudation of human flesh, the bouquet of the animal being, emanating, drying, leaving an odor of peculiar and ineradicable despair in the room, as if chemistry were turning into spirit, an ascension of some kind.
From a trivial incident, a cat's bite and a few scratches, the story unfolds, throwing Sophie and Otto into the center of a world where there is no room for cordiality or tolerance. A civilization without purpose, steeped in hatred and rudeness. Our world.
The post Paula Fox e a morte da civilidade appeared first on Rodrigo Gurgel.

As promised earlier this year, Pivotal has released the code for Greenplum Database as open source.
Greenplum Database is based on PostgreSQL (it was forked from PostgreSQL 8.2) and features a massively parallel processing (MPP) system for running SQL queries on very large data sets. The code base is licensed under the Apache 2.0 license and is available on GitHub. You can fork the project from there, or submit patches and new features.
One of the main goals of the engineering team is to merge the existing code base with a recent PostgreSQL version. Although many features from newer PostgreSQL versions have made it into Greenplum, the two code bases still differ in many places. Greenplum also offers unique features (a new query optimizer, SQL support for partitioning, append-optimized tables, columnar storage, storage compression and many more), which over time will be ported to PostgreSQL and submitted for community review.
Most of the development will move into the public (except for some internal customer-related work) and will be managed using newly created mailing lists on the greenplum.org website.
"The program we are setting up right now is of course about delivering the aircraft as soon as possible to the Brazilian Air Force, and also about starting technology transfer to Brazil," says Mikael Franzén, Program Manager for Gripen Brazil.
The technology transfer program was one of the top reasons for the selection of Gripen for the FX2 requirement. The program has been designed to contribute to the development of an independent, advanced defence industrial base in Brazil.
The technology transfer is divided into approximately 50 Transfer of Technology projects. The first group of Brazilian engineers and technicians (46 employees from Embraer and 2 from AEL) arrived in Sweden this month for on-the-job training. Over time, 350 Brazilian engineers will be coming to Sweden for training programs of 2 weeks to 2 years.
According to Saab, Brazilian industry will be responsible for developing a large part of some of the Gripen systems, including those of the two-seat version.
