Shared posts

08 Dec 20:57

This Cat Is Dead

by Emily Alford on Jezebel, shared by Andrew Couts to Gizmodo

Lil Bub, the internet cat famous for challenging and redefining the world’s ideals of feline beauty, has died. She was eight.

23 May 14:29

Running Streaming Jobs Once a Day For 10x Cost Savings

by Burak Yavuz and Tyson Condie

This is the sixth post in a multi-part series about how you can perform complex streaming analytics using Apache Spark.


Traditionally, when people think about streaming, terms such as “real-time,” “24/7,” or “always on” come to mind. You may have cases where data only arrives at fixed intervals. That is, data appears every hour or once a day. For these use cases, it is still beneficial to perform incremental processing on this data. However, it would be wasteful to keep a cluster up and running 24/7 just to perform a short amount of processing once a day.

Fortunately, by using the new Run Once trigger feature added to Structured Streaming in Spark 2.2, you will get all the benefits of the Catalyst Optimizer incrementalizing your workload, and the cost savings of not having an idle cluster lying around. In this post, we will examine how to employ triggers to accomplish both.

Triggers in Structured Streaming

In Structured Streaming, triggers are used to specify how often a streaming query should produce results. Once a trigger fires, Spark checks to see if there is new data available. If there is new data, then the query is executed incrementally on whatever has arrived since the last trigger. If there is no new data, then the stream sleeps until the next trigger fires.

The default behavior of Structured Streaming is to run with the lowest latency possible, so triggers fire as soon as the previous trigger finishes. For use cases that can tolerate higher latency, Structured Streaming supports a ProcessingTime trigger, which fires at a user-provided interval, for example every minute.

While this is great, it still requires the cluster to remain running 24/7. In contrast, a RunOnce trigger will fire only once and then will stop the query. As we’ll see below, this lets you effectively utilize an external scheduling mechanism such as Databricks Jobs.
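To make the trigger behavior concrete, here is a small pure-Python sketch. It is not Spark code: `run_query`, `source`, and `max_triggers` are hypothetical names invented for illustration, and the list of per-tick arrivals stands in for a real streaming source. It only models the idea that each firing processes what arrived since the last one, and that a run-once style trigger stops after the first firing.

```python
from typing import Callable, List, Optional


def run_query(source: List[List[str]],
              process: Callable[[List[str]], None],
              max_triggers: Optional[int] = None) -> int:
    """Fire one trigger per entry in `source` (the records that arrived
    since the previous trigger) and process only that new data.

    With max_triggers=1 this mimics a run-once trigger: fire, process
    whatever is available, then stop the query.
    Returns the number of triggers that actually processed data.
    """
    fired_with_data = 0
    for tick, new_records in enumerate(source):
        if max_triggers is not None and tick >= max_triggers:
            break  # run-once style: stop instead of waiting for more data
        if new_records:
            # New data is available, so run incrementally on just that data.
            process(new_records)
            fired_with_data += 1
        # Otherwise there is nothing to do until the next trigger fires.
    return fired_with_data


arrivals = [["a", "b"], [], ["c"]]

always_on: List[str] = []
run_query(arrivals, always_on.extend)             # keeps firing on every tick

once: List[str] = []
run_query(arrivals, once.extend, max_triggers=1)  # fires a single time
```

In the always-on case the loop has to stay alive through the empty tick, which is the idle cluster time that a run-once trigger avoids.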

Triggers are specified when you start your streams.

Python:

# Load your Streaming DataFrame
sdf = spark.readStream.load(path="/in/path", format="json", schema=my_schema)
# Perform transformations and then write…
sdf.writeStream.trigger(once=True).start(path="/out/path", format="parquet")

Scala:

import org.apache.spark.sql.streaming.Trigger

// Load your Streaming DataFrame
val sdf = spark.readStream.format("json").schema(my_schema).load("/in/path")
// Perform transformations and then write…
sdf.writeStream.trigger(Trigger.Once).format("parquet").start("/out/path")

Why Streaming and RunOnce is Better than Batch

You may ask, how is this different from simply running a batch job? Let’s go over the benefits of running Structured Streaming over a batch job.

Bookkeeping

When you’re running a batch job that performs incremental updates, you generally have to deal with figuring out what data is new, what you should process, and what you should not. Structured Streaming already does all this for you. In writing general streaming applications, you should only care about the business logic, and not the low-level bookkeeping.
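For a sense of what that bookkeeping involves when you do it by hand, here is a minimal sketch in plain Python. The helper names `new_files` and `mark_processed` and the JSON checkpoint file are assumptions made up for this example; Structured Streaming keeps its own checkpoint, and this is not its format.

```python
import json
import os
from typing import List


def new_files(input_dir: str, checkpoint_path: str) -> List[str]:
    """Return the files in input_dir that no earlier run has recorded."""
    done = set()
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            done = set(json.load(f))
    return sorted(name for name in os.listdir(input_dir) if name not in done)


def mark_processed(checkpoint_path: str, names: List[str]) -> None:
    """Record names in the checkpoint so that later runs skip them."""
    done: List[str] = []
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            done = json.load(f)
    with open(checkpoint_path, "w") as f:
        json.dump(sorted(set(done) | set(names)), f)
```

A daily batch job would call `new_files`, process the result, and then call `mark_processed`. Every edge case around that sequence (a crash between the two calls, concurrent runs, a corrupted checkpoint) is yours to handle, which is the work the streaming engine takes off your hands.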

Table Level Atomicity

The most important feature of a big data processing engine is how it can tolerate faults and failures. The ETL jobs may (in practice, often will) fail. If your job fails, then you need to ensure that its output is cleaned up; otherwise you will end up with duplicate or garbage data after the next successful run of your job.

When Structured Streaming writes out a file-based table, it commits all files created by the job to a log after each successful trigger. When Spark reads back the table, it uses this log to figure out which files are valid. This ensures that garbage introduced by failures is not consumed by downstream applications.
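The commit-log idea can be illustrated with a toy version in plain Python. This is a sketch of the pattern only: the `_commit_log` file name, the JSON entries, and the `write_batch` and `read_table` helpers are invented for this example and do not reflect Spark's actual on-disk layout.

```python
import json
import os
from typing import List


def write_batch(table_dir: str, batch_id: int, rows: List[str],
                fail: bool = False) -> None:
    """Write a data file, then record it in the log only on success."""
    data_file = f"part-{batch_id}.txt"
    with open(os.path.join(table_dir, data_file), "w") as f:
        f.write("\n".join(rows))
    if fail:
        # Simulated crash: the file is on disk but was never committed.
        raise RuntimeError("job failed before commit")
    with open(os.path.join(table_dir, "_commit_log"), "a") as log:
        log.write(json.dumps({"batch": batch_id, "file": data_file}) + "\n")


def read_table(table_dir: str) -> List[str]:
    """Read only the files listed in the commit log; ignore the rest."""
    log_path = os.path.join(table_dir, "_commit_log")
    if not os.path.exists(log_path):
        return []
    rows: List[str] = []
    with open(log_path) as log:
        for line in log:
            entry = json.loads(line)
            with open(os.path.join(table_dir, entry["file"])) as f:
                rows.extend(f.read().splitlines())
    return rows
```

A failed batch leaves its file behind in the directory, but because the reader trusts only the log, that file never shows up in the table.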

Stateful Operations Across Runs

If your data pipeline has the possibility of generating duplicate records, but you would like exactly once semantics, how do you achieve that with a batch workload? With Structured Streaming, it’s as easy as setting a watermark and using dropDuplicates(). By configuring the watermark long enough to encompass several runs of your streaming job, you will make sure that you don’t get duplicate data across runs.
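Here is a rough pure-Python model of how a watermark bounds the state needed for de-duplication across runs. It is a simplification under stated assumptions, not Spark's implementation: the `Deduplicator` class, the `(record_id, event_time)` tuples, and the integer timestamps are all hypothetical, and in Spark itself you would express this with a watermark plus `dropDuplicates()` on the streaming DataFrame.

```python
from typing import Dict, Iterable, List, Tuple

Record = Tuple[str, int]  # (record_id, event_time_in_seconds)


class Deduplicator:
    """Drops repeated ids across runs, keeping state only within the watermark."""

    def __init__(self, watermark_delay: int) -> None:
        self.watermark_delay = watermark_delay
        self.seen: Dict[str, int] = {}  # id -> event time, survives between runs
        self.max_event_time = 0

    def run(self, batch: Iterable[Record]) -> List[Record]:
        # The watermark is based on event times seen in earlier runs.
        watermark = self.max_event_time - self.watermark_delay
        emitted: List[Record] = []
        batch_max = self.max_event_time
        for rec_id, ts in batch:
            batch_max = max(batch_max, ts)
            if ts < watermark or rec_id in self.seen:
                continue  # too late to track, or already emitted
            self.seen[rec_id] = ts
            emitted.append((rec_id, ts))
        # Advance the watermark and evict state that has fallen behind it.
        self.max_event_time = batch_max
        cutoff = batch_max - self.watermark_delay
        self.seen = {k: v for k, v in self.seen.items() if v >= cutoff}
        return emitted


dedup = Deduplicator(watermark_delay=100)
first_run = dedup.run([("a", 10), ("b", 20)])
second_run = dedup.run([("b", 20), ("c", 50)])  # "b" repeats across runs
```

The delay has to cover the gap between runs: if the same id can reappear a day later, a watermark of a few minutes would already have evicted it from the state.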

Cost Savings

Running a 24/7 streaming job is a costly ordeal. You may have use cases where latency of hours is acceptable, or data comes in hourly or daily. To get all the benefits of Structured Streaming described above, you may think you need to keep a cluster up and running all the time. But now, with the “execute once” trigger, you don’t need to!

At Databricks, we had a two stage data pipeline, consisting of one incremental job that would make the latest data available, and one job at the end of the day that processed the whole day’s worth of data, performed de-duplication, and overwrote the output of the incremental job. The second job would use considerably larger resources than the first job (4x), and would run much longer as well (3x). We were able to get rid of the second job in many of our pipelines, which amounted to a 10x total cost savings. We were also able to clean up a lot of code in our codebase with the new execute once trigger. Those are cost savings that make both financial and engineering managers happy!

Scheduling Runs with Databricks

Databricks’ Jobs scheduler allows users to schedule production jobs with a few simple clicks. The Jobs scheduler is ideal for scheduling Structured Streaming jobs that run with the execute once trigger.

Screenshot of the Job Scheduler in Databricks

At Databricks, we use the Jobs scheduler to run all of our production jobs. As engineers, we ensure that the business logic within our ETL job is well tested. We upload our code to Databricks as a library, and we set up notebooks to set the configurations for the ETL job such as the input file directory. The rest is up to Databricks, which manages clusters and schedules and executes the jobs, and to Structured Streaming, which figures out which files are new and processes incoming data. The end result is an end-to-end (from data origin to data warehouse, not only within Spark) exactly-once data pipeline. Check out our documentation on how to best run Structured Streaming with Jobs.

Summary

In this blog post we introduced the new “execute once” trigger for Structured Streaming. While the execute once trigger resembles running a batch job, we discussed all the benefits it has over the batch job approach, specifically:

  • Managing all the bookkeeping of what data to process
  • Providing table level atomicity for ETL jobs to a file store
  • Ensuring stateful operations across runs of the job, which allow for easy de-duplication

In addition to all these benefits over batch processing, you also get the cost savings of not having an idle 24/7 cluster up and running for an irregular streaming job. The best of both worlds for batch and streaming processing is now at your fingertips.

Try Structured Streaming today in Databricks by signing up for a 14-day free trial.

Other parts of this blog series explain other benefits as well:

The post Running Streaming Jobs Once a Day For 10x Cost Savings appeared first on Databricks.

15 Apr 14:07

Donald Trump’s shifting positions have everything to do with the last person he spoke with

by Tim Fernholz

U.S. Republican presidential nominee Donald Trump puts his hand to his ear as he speaks at a campaign rally in Pueblo, Colorado, U.S., October 3, 2016.

Barack Obama was known for his basketball prowess in the White House, but no president crosses up his opponents and allies more than Donald Trump.

Trump’s constantly shifting policies could be charitably seen as clever expediency, but more and more often, it appears that the president is simply repeating the last thing he heard from an adviser, lobbyist, or head of state.

Yesterday, for example, Trump made a rapid about-face on two big economic issues. Having spent his campaign for president arguing that China has intentionally under-valued its currency and that the Federal Reserve is risking inflation with low interest rates, he abruptly reversed both of those positions in an interview with the Wall Street Journal. Now, China is not a manipulator, and Trump hinted that he may re-nominate Fed Chair Janet Yellen, who is seen as too dovish by many conservatives.

These positions aren’t controversial among experts—China has been trying to keep its currency up, not down, and there’s an active debate over whether the Fed should wait before it hikes interest rates. But how did Trump come to conclusions diametrically opposed to his campaign trail promises?

Here’s a hint: Treasury secretary Steve Mnuchin sat in on the interview. A former Goldman Sachs partner, Mnuchin is among the aides viewed as having a “globalist” influence by the hardcore conservative nationalists also populating the White House.

This isn’t the only time a Trump realization has been prompted by a nearby interlocutor. In February, he hosted a White House summit with health insurance executives. Trump, who confidently told voters “nobody knows the system better than me, which is why I alone can fix it,” emerged from the meeting a changed man. “Now, I have to tell you, it’s an unbelievably complex subject,” he said. “Nobody knew that health care could be so complicated.”

Perhaps the most worrisome aspect of this on-the-job education came in a startling anecdote about Trump’s efforts to restrain North Korea’s burgeoning and belligerent nuclear program. US diplomats have worked for decades with China in an attempt to influence the isolated dictatorship, with little success.

Trump, in his meetings with Chinese president Xi Jinping last week, expressed his view that China should be able to swiftly get North Korea back on the straight and narrow path. Xi quickly straightened Trump out by explaining the history of the region, Trump told the Journal.

“After listening for 10 minutes, I realized it’s not so easy,” Trump told the Journal. “I felt pretty strongly that they had a tremendous power…but it’s not what you would think.”

It’s far from clear how long any of these ideas will remain fixed in the president’s mind. But for a sneak peek into the president’s thoughts, just take a look at who he spoke with most recently.

02 Apr 10:49

China has an irrational fear of a “black invasion” bringing drugs, crime, and interracial marriage

by Joanna Chiu

Beijing

Earlier this month in Beijing, amid the pomp of China’s annual rubber-stamp parliament meetings, a politician proudly shared with reporters his proposal on how to “solve the problem of the black population in Guangdong.” The latter province is widely known in China to have many African migrants.

“Africans bring many security risks,” Pan Qinglin told local media (link in Chinese). As a member of the Chinese People’s Political Consultative Conference, the nation’s top political advisory body, he urged the government to “strictly control the African people living in Guangdong and other places.”

Pan, who lives in Tianjin near Beijing—and nowhere near Guangdong—held his proposal aloft for reporters to see. It read in part (links in Chinese):

“Black brothers often travel in droves; they are out at night out on the streets, nightclubs, and remote areas. They engage in drug trafficking, harassment of women, and fighting, which seriously disturbs law and order in Guangzhou… Africans have a high rate of AIDS and the Ebola virus that can be transmitted via body fluids… If their population [keeps growing], China will change from a nation-state to an immigration country, from a yellow country to a black-and-yellow country.”

On social media, the Chinese response has been overwhelmingly supportive, with many commenters echoing Pan’s fears. In a forum dedicated to discussions about black people in Guangdong on Baidu Tieba—an online community focused on internet search results—many participants agreed that China was facing a “black invasion.” One commenter called on Chinese people (link in Chinese) not to let “thousands of years of Chinese blood become polluted.”

The stream of racist vitriol online makes the infamous Chinese TV ad for Qiaobi laundry detergent, which went viral last year, seem mild in comparison. The ad featured an Asian woman stuffing a black man into a washing machine to turn him into a pale-skinned Asian man.

Not about reality

Of course, while a growing number of Africans work and study in China—the African continent’s largest trading partner—the notion that black people are “taking over” the world’s most populous nation is nonsense. Estimates for the number of sub-Saharan Africans in Guangzhou (nicknamed “Chocolate City” in Chinese) range from 150,000 long-term residents, according to 2014 government statistics, to as high as 300,000—figures complicated by the number of Africans coming in and out of the country as well as those who overstay their visas.

Many of them partner with Chinese firms to run factories, warehouses, and export operations. Others are leaving China and telling their compatriots not to go due to financial challenges and racism.

“Guangdong has come to be imagined to embody this racial crisis of some kind of ‘black invasion,’” said Kevin Carrico, a lecturer at Macquarie University in Australia who studies race and nationalism in China. “But this is not about actually existing realities.” He continued:

“It isn’t so much that they dislike black residents as they dislike what they imagine about black residents. The types of discourses you see on social media sites are quite repetitive—black men raping Chinese women, black men having consensual sex with Chinese women and then leaving them, blacks as drug users and thieves destroying Chinese neighborhoods. People are living in a society that is changing rapidly. ‘The blacks’ has become a projection point for all these anxieties in society.”

The past year or so has seen heated debate among black people living in China about what locals think of them. In interviews with Quartz, black residents referred to online comments and racist ads as more extreme examples, but said they are symptomatic of broader underlying attitudes.

Senegalese journalist Madeleine Thiam in Beijing.

Madeleine Thiam and Christelle Mbaya, Senegalese journalists at a Chinese international radio broadcaster in Beijing, said they are saddened but not shocked when they are discriminated against in China.

“Sometimes people pinch their noses as I walk by, as if they think I smell. On the subway, people often leave empty seats next to me or change seats when I sit down,” said Thiam. “Women have come up to rub my skin, asking if it is ‘dirt’ and if I’ve had a shower.”

Yet on a recent coffee break most passersby politely admired the fashionable women as if they were going down a catwalk.

One Chinese man, gazing at Thiam in her purple lace blouse and a yellow dress flaring around her hips, let out an admiring “wow” as the elevator doors opened to a third-floor café. Servers greeted their regulars with warm smiles and asked them in English, “How are you?”

Racism or ignorance?

Such experiences speak to the duality of life for black people in China. They may be athletes, entrepreneurs, traders, designers, or graduate students. Some are married to locals and speak fluent Chinese. Yet despite positive experiences and economic opportunities, many are questioning why they live in a place where they often feel unwelcome.

They grapple with the question: Is it racism or ignorance? And how do you distinguish the two?

Paolo Cesar, an African-Brazilian who has worked as a musician in Shanghai for 18 years and has a Chinese wife, said music has been a great way for him to connect with audiences and make local friends. However, his mixed-race son often comes home unhappy because of bullying at school. Although he speaks fluent Mandarin, his classmates do not accept him as Chinese. They like to shout out, “He’s so dark!”

The global success of black public figures, such as politicians, actors, and athletes, appears to have a limited effect on Chinese attitudes.

“People would say to me, ‘Obama! You’re a black American!’ And I’d be treated better than my African friends,” said Jayne Jeje, a marketing consultant from Maryland who has worked all over mainland China and now lives in Hong Kong. “I think it’s a class thing. If you’re African you’re from a poor place and should be treated with less deference, but if you’re African-American, that’s great, and you get some grudging respect.”

In response to international criticism of racism against blacks in China, some commentators have argued that the racism is not as serious as it is in other countries. Hong Kong columnist Alex Lo wrote in the South China Morning Post that criticism from Americans is “rich coming from a country that was founded on black slavery… China has racial problems. But murderous racism against blacks isn’t one of them.”

And of course racial tensions occur elsewhere, sometimes with ethnic Chinese as the victims. In France this week, Chinese protesters gathered in northeast Paris to protest the shooting of a Chinese man by police. Many complain of racism directed against them, and also of being targeted by gangs (video) of North African descent.

Looking deeper into history, evidence suggests a preference for slaves from East Africa in ancient China. African slavery in the country peaked during the Tang (618 to 907) and Song (960 to 1279) dynasties.

More recently, violence broke out after the Chinese government started providing scholarships allowing African students to study in the country in the 1960s. Many Chinese students resented the stipends Africans received, with tensions culminating in riots in Nanjing in the late 1980s. The riots began with angry Chinese students surrounding African students’ dormitories in Hehai University and pelting them with rocks and bottles for seven hours, with crowds later marching through the streets shouting anti-African slogans.

In the past few years, loathing among some Chinese toward foreign men who date local women has led to a rise in violent attacks against foreigners.

Staying optimistic

Yet most respondents Quartz interviewed remain optimistic. Vladimir Emilien, a 26-year-old African-American actor and former varsity athlete, said that for him, learning Chinese was crucial to better interactions with locals. Emilien volunteered last year as a coach teaching Beijing youth the finer points of American football. He said that once he was able to have more complex conversations in Chinese, he was struck by the thoughtful questions locals would ask.

African-American expat Vladimir Emilien volunteering as a football coach in Beijing.

Go deep.

“They’d say, ‘What do you think about Chinese perception of black people? How does that make you feel?’ So they are aware that there is a lot of negativity around blacks and against Africa as a very poor place.”

Emilien hopes that more interactions between Chinese and black individuals will smooth out misunderstandings. But others say that improving relations requires more than black people learning the language, since that shifts responsibility away from the Chinese.

“The government has never done anything serious to clean up racist ideas created and populated by the [turn-of-the-20th-century] intellectuals and politicians that constructed a global racial hierarchy in which the whites were on the top, Chinese the second, and blacks the bottom,” said Cheng Yinghong, a history professor at Delaware State University who researches nationalism and discourse of race in China.

Instead of addressing discrimination, the Chinese government has focused on promoting cultural exchanges while pursuing economic partnerships with African countries. However, many have pointed out that relationships appear unbalanced, with China taking Africa’s limited natural resources in exchange for infrastructure investment.

“Racism is racism, period, and although some people would say that in different places it is more explicit, nuanced, or implicit, as long as there are victims we have to call it racism and deal with it,” said Adams Bodomo, a professor of African studies focused on cross-cultural communication at the University of Vienna. “China can’t be the second-largest economy in the world and not expect to deal with these issues.”

You can follow Joanna on Twitter at @joannachiu.

03 Nov 07:26

Bolivia's Witch Market in La Paz, Bolivia

Bolivia's Witch Market

Located on Calle Jiminez and Linares between Sagarnaga and Santa Cruz, the Witches' Market of La Paz, Bolivia, is impossible to miss: it sits right in a lively tourist area. Dozens of vendors line the streets to sell a number of strange and fascinating products and the raw ingredients used in rituals to call on the spirits that populate the Aymara world.

Among the many items sold at the market are dried llama fetuses that are said to bring both prosperity and good luck, dried frogs used for Aymara rituals, soapstone figurines, aphrodisiac formulas, owl feathers, dried turtles and snakes, herbs, and folk remedies. Witch doctors in dark hats and dresses wander through the market offering fortune-telling services.

The dried llama fetuses are the most prominent product available at the market. These animals are fairly large and are used throughout the country, buried in the foundations of new buildings as an offering to the goddess Pachamama. It is believed that the buried llama fetuses keep construction workers safe, but these are only used by poor Bolivians. Wealthy Bolivians usually sacrifice a living llama to Pachamama.

05 May 11:24

We Finally Know Why DC Had an Earthquake in 2011

by Maddie Stone

Californians may scoff at the idea of a mid-sized earthquake, but when our nation’s capital started shaking on August 23, 2011, people freaked out. Having grown up in the DC metro area, I can tell you why: we don’t get noticeable earthquakes. We are seismically boring and perfectly cool with it.

23 Mar 22:18

Turn Your Roomba Into the Star Wars Droid You Already Pretend It Is

by Andrew Liszewski

iRobot’s Roomba vacuum cleaners are about the closest thing you can get to having a real Star Wars droid at home. In fact, many Roomba owners are happy to pretend their robovac is just a shorter version of R2-D2 while it works away, and this decal set will help make that even more believable.

11 Aug 21:36

5 Random Things I Did This Weekend

by DC Rainmaker

Here’s the rundown of what I was up to over the last few days, a bit of a calm weekend in comparison.

1) Friday Night Date Run

We started off the weekend with a bit of a (rather hot) run down towards the Eiffel Tower and back.  Sorta one of our mainstay runs along the river, and a nice casual run together seemed like a good idea.

As is often the case I was testing a few different things at once, trying to nail down HR accuracy tests for various optical sensors.

With most of the Parisian locals gone for August, we were left with primarily just tourists to run around.  The lack of Parisians is most notable in areas where you’d typically find higher concentrations of Parisians (i.e. Les Berges), whereas the tourist-heavy areas (such as the Eiffel Tower) are definitely still flooded.

As is often the case, we took a few selfies along the way.  Unfortunately, none of them came out due to being a bit too dark.  So instead, here’s a picture of sunset:

IMG_0086

All in a touch over 10KM, nothing too fancy or extreme – but a good pre-dinner jaunt.

2) More power meter testing

Despite having two different power meter analysis posts last week, I’m full steam ahead on both new power meters, and slightly more aged power meters that have been in the queue.  On the ‘new’ front I’ve got both the PowerTap C1 and bePRO units on the Cervelo doing testing.

While in the ‘things that I haven’t yet written about’ front, I’ve got the Verve Infocrank long overdue for its review to be published this week.  I’ve had that data and photos done for months, just been trying to get through products that folks have shown/demanded more interest in.  So while there is certainly more riding to be done on the PowerTap C1 and bePRO pedals side, I can at least push out the Infocrank review.  The simple goal being to clear out all power meter reviews prior to Eurobike at the end of the month.

As for those two, which I did ride with this weekend: things are definitely looking much better than last week – both in terms of the offset (where I believe the Edge 810 was actually at fault due to GPS time not being quite correct), as well as the bePRO having settled a bit more.  Once I corrected the offsets of last week, it was still showing a fair bit low.

However, this week it seems to be more in the game, perhaps an itty-bitty-bit high (a few watts more than I’d expect), but nothing too alarming.  There are many power meters on the market that require some settling in time, and this might just be the case here as well.

image

You can see when I zoom in, things look pretty good in terms of matching trends (with the caveat as noted that it’s a tiny bit high).  Do be mindful to note the scale on these graphs is different than above.

image

image

A few little sections I need to account for, but I’ll see how things go this week.  Though, I suspect my main concerns (if any) around the bePRO will actually be durability.  I’m not convinced the pods would last through a winter of use with sand, rain, and other oils on the road.  Pretty summer-time riding is easy.  But winter?  That’s a whole different ballgame.  Hopefully by the end of the month I’ll have a clearer idea of how things are holding up and trending.

3) Recovery from the Giveaway Extravaganza:

Seriously, sleep.

This week was exhausting.  It’s semi-interesting looking at the graphs from the Withings Aura*.   Partly because it felt like I got far more sleep than I actually did.  And partly because the weekend was still broken up in semi-wonky ways.

As is often the case I’ll read before falling asleep, usually just the news and Twitter, which you see below.  However, what’s not captured is falling asleep on the couch at the office for 2-3 hours Saturday afternoon/evening.  That couch does not have sensors in it.

image

The same is true here, reading for a while.  But, the large gap is when we had to go do an early morning Cupcakery delivery.  So you see no data while I was gone.  Then I got back, answered some e-mail for a short bit and then fell asleep for a few hours before lounging around for another 30 minutes or so.  Of course, all of this somehow still totaled only 6hr and 53 mins.

image

(*I know, you probably want a review on it. Perhaps some day.  Today is not that day. Nor this week. It was something I bought on a curiosity last year, and it’s steadily got better since. Maybe I’ll write a review on it once they enable the ability to pair it as a standard Bluetooth audio speaker to my phone.  Otherwise, it’s just the world’s biggest bedside speaker that you can’t pair.)

4) Sunday Evening Run

We were back out running together Sunday night before dinner, with another loop down to the Eiffel Tower, though this time in reverse.

Thankfully the temps were a bit cooler this time, and also a fair bit less humid.  Woot to lower humidity!  Otherwise, pretty much just a nice easy and normal run.

5) A very short jaunt to dinner

After our run we headed out to try a new dinner spot.  Well, it wasn’t new, rather, it was the first time we’d try it.  Made all the more funny in the fact that it’s all of perhaps a 50m walk (line of sight it’d be perhaps 25m from our stairs if we didn’t have to walk around the building’s side).  Of course, both it and us have been here three years.

While it may sound odd we hadn’t gone there yet, there are at rough count 14 restaurants within a 100m radius of our front door. Some of them fantastic, some of them so-so, and some of them downright horrible.  We walk past almost all of them at least 1-2 times a day (if not 4-5 times a day), so you eventually get to see many of the dishes they serve.  But this one has less of a front, so we didn’t usually see as many dishes.

In any case, the menu was awesome.  They had both a tasting menu as well as a regular menu.  The tasting menu simply being a higher quantity of smaller portioned dishes off of the main menu.  Fantastic food.

Definitely on our list to go back again, even more so since they were friendly and knew a fair bit about the food.  A perfect way to end the weekend.  Plus, the 30 second commute back to the house was kinda nice too.

Thanks for reading!