14 Jan 14:01

Trash Bird’s Success

by Reza

Alois Mahdal, BillyHoya and 2 others like this

12 Jan 22:15

Saturday Morning Breakfast Cereal - Ark

by tech@thehiveworks.com

Click here to go see the bonus panel!

Hovertext:
There's actually a whole 'nother page on the other side.

Today's News:

Hey! Me and Kelly are working on a project, and we started a little twitter account where we post Weird Stories from Space.

Roslyn , Jacopo.bertolotti and one other like this

10 Jan 13:30

10 ML & NLP Research Highlights of 2019

by Sebastian Ruder

This post gathers ten ML and NLP research directions that I found exciting and impactful in 2019.

For each highlight, I summarise the main advances that took place this year, briefly state why I think it is important, and provide a short outlook to the future.

The full list of highlights is here:

Universal unsupervised pretraining
Lottery tickets
The Neural Tangent Kernel
Unsupervised multilingual learning
More robust benchmarks
ML and NLP for science
Fixing decoding errors in NLG
Augmenting pretrained models
Efficient and long-range Transformers
More reliable analysis methods

1) Universal unsupervised pretraining

What happened? Unsupervised pretraining was prevalent in NLP this year, mainly driven by BERT (Devlin et al., 2019) and other variants. A whole range of BERT variants have been applied to multimodal settings, mostly involving images and videos together with text (for an example see the figure below). Unsupervised pretraining has also made inroads into domains where supervision had previously reigned supreme. In biology, Transformer language models have been pretrained on protein sequences (Rives et al., 2019). In computer vision, approaches leveraged self-supervision including CPC (Hénaff et al., 2019), MoCo (He et al., 2019), and PIRL (Misra & van der Maaten, 2019) as well as strong generators such as BigBiGAN (Donahue & Simonyan, 2019) to improve sample efficiency on ImageNet and image generation. In speech, representations learned with a multi-layer CNN (Schneider et al., 2019) or bidirectional CPC (Kawakami et al., 2019) outperform state-of-the-art models with much less training data.

Why is it important? Unsupervised pretraining enables training models with much fewer labelled examples. This opens up new applications in many different domains where data requirements were previously prohibitive.

What's next? Unsupervised pretraining is here to stay. While the biggest advances have been achieved so far in individual domains, it will be interesting to see a focus towards tighter integration of multiple modalities.

2) Lottery tickets

What happened? Frankle and Carbin (2019) identified winning tickets, subnetworks in dense, randomly-initialised, feed-forward networks that are so well initialised that training them in isolation achieves similar accuracy to training the full network, as can be seen below. While the initial pruning procedure only worked on small vision tasks, later work (Frankle et al., 2019) applied the pruning early in training instead of at initialisation, which makes it possible to find small subnetworks of deeper models. Yu et al. (2019) find winning ticket initialisations also for LSTMs and Transformers in NLP and RL models. While winning tickets are still expensive to find, it is promising that they seem to be transferable across datasets and optimisers (Morcos et al., 2019) and domains (Desai et al., 2019).

Why is it important? State-of-the-art neural networks are getting larger and more expensive to train and to use for prediction. Being able to consistently identify small subnetworks that achieve comparable performance enables training and inference with much fewer resources. This can speed up model iteration and opens up new applications in on-device and edge computing.

What's next? Identifying winning tickets is currently still too expensive to provide real benefits in low-resource settings. More robust one-shot pruning methods that are less susceptible to noise in the pruning process should mitigate this. Investigating what makes winning tickets special should also help us gain a better understanding of the initialisation and learning dynamics of neural networks.

3) The Neural Tangent Kernel

What happened? Somewhat counter-intuitively, very wide (more concretely, infinitely wide) neural networks are easier to study theoretically than narrow ones. It has been shown that in the infinite-width limit, neural networks can be approximated as linear models with a kernel, the Neural Tangent Kernel (NTK; Jacot et al., 2018). Refer to this post for an intuitive explanation of NTK including an illustration of its training dynamics (see the figure below). In practice, such models, have underperformed their finite-depth counterparts (Novak et al., 2019; Allen-Zhu et al., 2019; Bietti & Mairal, 2019), which limits applying the findings to standard methods. Recent work (Li et al., 2019; Arora et al., 2019), however, has significantly reduced the performance gap to standard methods (see Chip Huyen's post for other related NeurIPS 2019 papers).

Why is it important? The NTK is perhaps the most powerful tool at our disposal to analyse the theoretical behaviour of neural networks. While it has its limitations, i.e. practical neural networks still perform better than their NTK counterparts, and insights so far have not translated into empirical gains, it may help us open the black box of deep learning.

What's next? The gap to standard methods seems to be mainly due to benefits of the finite width of such methods, which future work may seek to characterise. This will hopefully also help translating insights from the infinite-width limit to practical settings. Ultimately, the NTK may help us shed light on the training dynamics and generalisation behaviour of neural networks.

4) Unsupervised multilingual learning

What happened? Cross-lingual representations had mostly focused on the word level for many years (see this survey). Building on advances in unsupervised pretraining, this year saw the development of deep cross-lingual models such as multilingual BERT, XLM (Conneau & Lample, 2019), and XLM-R (Conneau et al., 2019). Even though these models do not use any explicit cross-lingual signal, they generalise surprisingly well across languages—even without a shared vocabulary or joint training (Artetxe et al., 2019; Karthikeyan et al., 2019; Wu et al., 2019). For an overview, have a look at this post. Such deep models also brought improvements in unsupervised MT (Song et al., 2019; Conneau & Lample, 2019), which hit its stride last year (see highlights of 2018) and saw improvements from a more principled combination of statistical and neural approaches (Artetxe et al., 2019). Another exciting development is the bootstrapping of deep multilingual models from readily available pretrained representations in English (Artetxe et al., 2019; Tran, 2020), which can be seen below.

Why is it important? Ready-to-use cross-lingual representations enable training of models with fewer examples for languages other than English. Furthermore, if labelled data in English is available, these methods enable essentially free zero-shot transfer. They may finally help us gain a better understanding of the relationships between different languages.

What's next? It is still unclear why these methods work so well without any cross-lingual supervision. Gaining a better understanding of how these methods work will likely enable us to design more powerful methods and may also reveal insights about the structure of different languages. In addition, we should not only focus on zero-shot transfer but also consider learning from few labelled examples in the target language (see this post).

5) More robust benchmarks

There is something rotten in the state of the art.
—Nie et al. (2019) paraphrasing Shakespeare

What happened? Recent NLP datasets such as HellaSWAG (Zellers et al., 2019) are created to be difficult for state-of-the-art models to solve. Examples are filtered by humans to explicitly retain only those where state-of-the-art models fail (see below for an example). This process of human-in-the-loop adversarial curation can be repeated multiple times such as in the recent Adversarial NLI (Nie et al., 2019) benchmark to enable the creation of datasets that are much more challenging for current methods.

Why is it important? Many researchers have observed that current NLP models do not learn what they are supposed to but instead adopt shallow heuristics and exploit superficial cues in the data (described as the Clever Hans moment). As datasets become more robust, we would hope that models will be forced to eventually learn the true underlying relations in the data.

What's next? As models become better, most datasets will need to be continuously improved or will quickly become outdated. Dedicated infrastructure and tools will be necessary to facilitate this process. In addition, appropriate baselines should be run including simple methods and models using different variants of the data (such as with incomplete input) so that initial versions of datasets are as robust as possible.

6) ML and NLP for science

What happened? There have been some major advances of ML being applied to fundamental science problems. My highlights were the application of deep neural networks to protein folding and to the Many-Electron Schrödinger Equation (Pfau et al., 2019). On the NLP side, it is exciting to see what impact even standard methods can have when combined with domain expertise. One study used word embeddings to analyse latent knowledge in the materials science literature (Tshitoyan et al., 2019), which can be used to predict which materials will have certain properties (see the figure below). In biology, a lot of data such as genes and proteins is sequential in nature. It is thus a natural fit for NLP methods such as LSTMs and Transformers, which have been applied to protein classification (Strodthoff et al., 2019; Rives et al., 2019).

Why is it important? Science is arguably one of the most impactful application domains for ML. Solutions can have a large impact on many other domains and can help solve real-world problems.

What's next? From modelling energy in physics problems (Greydanus et al., 2019) to solving differential equations (Lample & Charton, 2020), ML methods are constantly being applied to new applications in science. It will be interesting to see what the most impactful of these will be in 2020.

7) Fixing decoding errors in NLG

What happened? Despite ever more powerful models, natural language generation (NLG) models still frequently produce repetitions or gibberish as can be seen below. This was shown to be mainly a result of the maximum likelihood training. I was excited to see improvements that aim to ameliorate this and are orthogonal to advances in modelling. Such improvements came in the form of new sampling methods, such as nucleus sampling (Holtzman et al., 2019) and new loss functions (Welleck et al., 2019). Another surprising finding was that better search does not lead to better generations: Current models rely to some extent on an imperfect search and beam search errors. In contrast, an exact search most often returns an empty translation in the case of machine translation (Stahlberg & Byrne, 2019). This shows that advances in search and modelling must often go hand in hand.

Why is it important? Natural language generation is one of the most general tasks in NLP. In NLP and ML research, most papers focus on improving the model, while other parts of the pipeline are typically neglected. For NLG, it is important to remind ourselves that our models still have flaws and that it may be possible to improve the output by fixing the search or the training process.

What's next? Despite more powerful models and successful applications of transfer learning to NLG (Song et al., 2019; Wolf et al., 2019), model predictions still contain many artefacts. Identifying and understanding the causes of such artefacts is an important research direction.

8) Augmenting pretrained models

What happened? I was excited to see approaches that equip pretrained models with new capabilities. Some methods augment a pretrained model with a knowledge base in order to improve modelling of entity names (Liu et al., 2019) and the recall of facts (Logan et al., 2019). Others enable it to perform simple arithmetic reasoning (Andor et al., 2019) by giving it access to a number of predefined executable programs. As most models have a weak inductive bias and learn most of their knowledge from data, another way to extend a pretrained model is by augmenting the training data itself, e.g. to capture common sense (Bosselut et al., 2019) as can be seen below.

Why is it important? Models are becoming more powerful but there are many things that a model cannot learn from text alone. Particularly when dealing with more complex tasks, the available data may be too limited to learn explicit reasoning using facts or common sense and a stronger inductive bias may often be necessary.

What's next? As models are being applied to more challenging problems, it will increasingly become necessary for modifications to be compositional. In the future, we might combine powerful pretrained models with learnable compositional programs (Pierrot et al., 2019).

9) Efficient and long-range Transformers

What happened? This year saw several improvements to the Transformer (Vaswani et al., 2017) architecture. The Transformer-XL (Dai et al., 2019) and the Compressive Transformer (Rae et al., 2020), which can be seen below enable it to better capture long-range dependencies. Many approaches sought to make the Transformer more efficient mostly using different—often sparse—attention mechanisms, such as adaptively sparse attention (Correia et al., 2019), adaptive attention spans (Sukhbaatar et al., 2019), product-key attention (Lample et al., 2019), and locality-sensitive hashing (Kitaev et al., 2020). On the Transformer-based pretraining front, there have been more efficient variants such as ALBERT (Lan et al., 2020), which employs parameter sharing and ELECTRA (Clark et al., 2020), which uses a more efficient pretraining task. There were also more efficient pretrained models that did not utilise a Transformer, such as the unigram document model VAMPIRE (Gururangan et al., 2019) and the QRNN-based MultiFiT (Eisenschlos et al., 2019). Another trend was the distillation of large BERT models into smaller ones (Tang et al., 2019; Tsai et al., 2019; Sanh et al., 2019).

Why is it important? The Transformer architecture has been influential since its inception. It is a part of most state-of-the-art models in NLP and has been successfully applied to many other domains (see Sections 1 and 6). Any improvement to the Transformer architecture may thus have strong ripple effects.

What's next? It will take some time for these improvements to trickle down to the practitioner but given the prevalence and ease of use of pretrained models, more efficient alternatives will likely be adopted quickly. Overall, we will hopefully see a continuing focus on model architectures that emphasise efficiency, with sparsity being one of the key trends.

10) More reliable analysis methods

What happened? A key trend for me this year was the increasing number of papers analysing models. In fact, several of my favourite papers this year were such analysis papers. An early highlight was the excellent survey of analysis methods by Belinkov & Glass (2019). This year was also the first one (in my memory) where many papers were dedicated to analysing a single model, BERT (such papers are known as BERTology). In this context, probes (see the figure below), which aim to understand whether a model captures morphology, syntax, etc. by predicting certain properties have become a common tool. I particularly appreciated papers that make probes more reliable (Liu et al., 2019; Hewitt & Liang, 2019). Reliability is also a theme in the ongoing conversation on whether attention provides meaningful explanations (Jain & Wallace, 2019; Wiegreffe & Pinter, 2019; Wallace, 2019). The continuing interest in analysis methods is perhaps best exemplified by the new ACL 2020 track on Interpretability and Analysis of Models in NLP.

Why is it important? State-of-the-art methods are used as black boxes. In order to develop better models and to use them in the real world, we need to understand why models make certain decisions. However, our current methods to explain models' predictions are still limited.

What's next? We need more work on explaining predictions that goes beyond visualisation, which is often unreliable. An important trend in this direction are human-written explanations that are being provided by more datasets (Camburu et al., 2018; Rajani et al., 2019; Nie et al., 2019).

View attached file (videobert-1.png, image/png)

01 Jan 20:18

"מאגר לוויתן לא יפותח"

by חנן עמיאור

לכבוד חיבור אסדת לוויתן והתחלת הזרמת הגז מהמאגר למשקי האנרגיה של ישראל, ירדן ומצרים, הנה מסע קצר בקמפיין העיקש שניהל עיתון 'הארץ דה מרקר' בשנים האחרונות נגד הוצאת הגז מהאדמה בתואנות שונות, משונות, מתעדכנות ומתחלפות:

הטענה בדה מרקר: "מאגר לוויתן לא יפותח"

המציאות: מאגר לוויתן פותח והחל להזרים גז לצנרת חברת הגז הלאומית

לאחר שהתברר שיפותח, התחלפה הטענה: אם יפותח, לא בלוחות הזמנים שנקבעו

המציאות: הופעל בדיוק בזמן שנקבע מראש

הטענה בדה מרקר: "ירדן ומצרים לא מתכוונות לייבא גז מישראל"

המציאות: ירדן ומצרים חותמות על חוזי רכישת גז ארוכי טווח מול ישראל

(לאחר שהתברר שמצרים כן תקנה גז מישראל, התחלפה הטענה: מכירת הגז למצרים "מפקירה את ביטחון ישראל")

הטענה בדה מרקר: ישראל תהיה תלויה בצינור גז אחד שיתחבר לחוף אשדוד ויהיה מאויים בקסאמים מרצועת עזה

המציאות: אל הצינור המתחבר לחוף אשדוד נוסף צינור שני המחבר את אסדת לוויתן לחוף דור

הטענה בדה מרקר: בלי מגבלות יצוא הגז ייגמר

המציאות: הגז מספיק ל 150 שנה. אם לא יופנה ליצוא יינטש

הטענה בדה מרקר: "בכל תרחיש, מחיר הגז בישראל יוסיף להאמיר בשנים הבאות באופן אוטומטי, ויהיה מהגבוהים בעולם"

המציאות: מחיר הגז בישראל נמוך בהשוואה בינלאומית. רק בארה"ב הגז זול מבישראל

הטענה בדה מרקר: המונופול יתעצם ולא תהיה תחרות

המציאות: אין מונופול וכבר כעת ניטשת תחרות ו"גניבת חוזים" בין שלושה גופים – נובל אנרג'י, ישראמקו ואנרג'יאן היוונית

הטענה בדה מרקר: הפעלת האסדה תיצור זיהום אוויר חמור

המציאות: האסדה הופעלה ולא נרשמו חריגות ברמת הזיהום

The post "מאגר לוויתן לא יפותח" appeared first on פרספקטיבה.

30 Dec 07:25

A New and Exhilarating Weapon.

by languagehat

Yuval Pinter
"They claimed that soldiers used the word ‘fucking’ so often that it was merely a warning ‘that a noun is coming’."

Bee Wilson’s 2018 LRB review of The Littlehampton Libels: A Miscarriage of Justice and a Mystery about Words in 1920s England by Christopher Hilliard (Oxford, June 2017) is full of all sorts of interesting things, as obviously is the book, but I want to quote some bits about language:

Edith Swan, a 30-year-old laundress from the seaside town of Littlehampton in Sussex, was accused of sending a letter to a sanitary inspector called Charles Gardner that contained words of ‘an indecent, obscene and grossly offensive character’. […]

The Littlehampton Libels by Christopher Hilliard is a short but dazzling work of microhistory. It uses the story of some poison pen letters in a small town to illuminate wider questions of social life in Britain between the wars, from ordinary people’s experience of the legal system to the way people washed their sheets, and is a far more exciting book than either the title or the rather dull cover would suggest. For a short period, the mystery of these letters became a national news story that generated four separate trials and, as Hilliard writes, ‘demanded more from the police and the lawyers than most murders’.

This is a book about morality and class, about the uses and abuses of literacy and about the tremendous dislocations in British society after the First World War, which extended far beyond those who had suffered the direct trauma of battle. Hilliard uses these poison pen letters – written in language that was as eccentric as it was obscene – to ‘catch the accents of the past’. The Littlehampton Libels is about a battle between two women who were members of only the second generation in Britain to benefit from compulsory elementary education, women for whom the written word was a new and exhilarating weapon.

Hilliard asks what it was like to live in a society where ‘nice’ women had to pretend that they were ignorant of all profanity. Melissa Mohr claims in her excellent book Holy Sh*t: A Brief History of Swearing (2013) that the British started to swear more during and after the First World War, because strong language – like strong drink – is a way to alleviate despair. In 1930, John Brophy and Eric Partridge published a collection of British songs and slang from the war. They claimed that soldiers used the word ‘fucking’ so often that it was merely a warning ‘that a noun is coming’. In a normal situation, swear words are used for emphasis, but Brophy and Partridge found that obscenity was so over-used among the military in the Great War that if a soldier wanted to express emotion he wouldn’t swear. ‘Thus if a sergeant said, “Get your —ing rifles!” it was understood as a matter of routine. But if he said, “Get your rifles!” there was an immediate implication of urgency and danger.’

As former soldiers re-established themselves as civilians, swearing became normalised, but it was only acceptable when used by men and addressed to men. The story of the Littlehampton libels reveals the extent to which British society at this time clung to certain beliefs about women and language. One of these prejudices, fiercely held, was that a ‘respectable’ woman was incapable of allowing a dirty word to sully her mouth. Another was that women who did swear were beyond the pale, and therefore capable of anything. The tenacity of these prejudices within the legal system would allow Edith Swan to send multiple poison pen letters to her neighbours over a period of three years and contrive to have a less ‘respectable’ woman – Rose Gooding – twice sent to jail for crimes of which she was entirely innocent.

Gooding (spoiler!) was eventually freed and exonerated:

In the end, Rose Gooding’s faulty spelling helped to save her. Inspector Nicholls painstakingly went through 27 letters that Rose sent to Bill from Portsmouth jail and found that she always misspelled the word ‘prison’ as ‘prision’. This was a mistake that the author of the libels never made (one of Edith Swan’s school teachers said she was ‘very clever at Essay writing, and a good penman’). Unlike Rose Gooding’s public exclamations of ‘bloody old cow’ and so on, which were easily copied by Edith Swan in the letters, ‘prision’ was a little quirk of Rose’s language that no one knew about except for Bill. On 25 July 1921, the Court of Criminal Appeals heard Rose’s case and overturned both of her convictions.

But it’s appalling what she had to go through because she seemed so clearly of a lower class than the butter-wouldn’t-melt-in-her-mouth Swan.

30 Dec 07:04

Libfix report for December 2019

by Kyle Gorman

A while ago I acquired a dictionary of English blends (Thurner 1993), and today I went through it looking for candidate libfixes I hadn’t yet recorded. Here are a few I found. From burlesque, we have –lesque, used to form both boylesque and girlesque. The kumquat gives rise to –quat. This is used in two (literal) hybrid fruits: citrangequat and limequat. From melancholy comes –choly, used to build solemncholy ‘a solemn or serious mood’ and the unglossable lemoncholy. From safari there is –fari, used to build seafari, surfari, and even snowfari. Documentary has given rise to –mentary, as in mockumentary and rockumentary.

An interesting case is that of –stache. While stache is a common clipping of mustache, it is commonly used as an affix as well, as in liquid-based beerstache and milkstache and the pejorative fuckstache and fuzzstache.

I also found a number of libfix-like elements that can plausibly be analyzed as affixes rather than cases of “liberation”. Some examples are –eteer (blacketeer, stocketeer), –legger (booklegger, meatlegger), and –logue (duologue, pianologue, travelogue). I do not think these are properly defined as libfixes (they are a bit like -giving) but I could be wrong.

References

D. Thurner (1993). The portmanteau dictionary: blend words in the English language, including trademarks and brand names. Jefferson, NC: MacFarland & Co.

27 Dec 08:07

Comic for December 23, 2019

Dilbert readers - Please visit Dilbert.com to read this feature. Due to changes with our feeds, we are now making this RSS feed a link to Dilbert.com.

25 Nov 16:21

Queer NLP - stand and be counted

TL;DR: I want marginalized people and communities to be recognized by those who design AI systems, and for those people’s needs to be clearly communicated and understood.

See the poster here!

I don’t talk a lot about being gay, because I’m privileged enough to work in a community where my identity isn’t a problem. But! It does make me look at the world differently, and I’ve recently started thinking about issues in natural language processing that intersect with LGBTQ issues. What are concrete problems related to language use that relate to LGBTQ people, and how can we start talking about them in a productive way?

AI research often focuses on building systems with a concrete task in mind (e.g. parsing sentences) rather than serving actual humans with complicated social needs, which often requires a participatory design approach with direct input from the people affected by such systems. In general, I think that we need a paradigm shift away from systems that address context-free problems and toward systems that serve people. There is no financial benefit for someone to make an AI system that serves marginalized people. However, helping marginalized people is how society moves forward, and we cannot pretend anymore that building systems purely for linguistic analysis or information retrieval will directly solve systemic inequalities in society. I am not the first person to think about this issue (e.g. Timnit Gebru, Margaret Mitchell and Kate Crawford have all done really impactful work about the role of AI in society), but I think that it’s time that for me to stand and be counted.

After submitting an unrelated paper in September, I spent about 6 hours brain-dumping my thoughts around the issue of LGBTQ people and NLP into an abstract, which was accepted to the NeurIPS Queer in AI workshop. I’m going to present a poster about “How natural language processing can (and should) serve LGBTQ people”, and I want to break it down in a blog post so it’s easier to share.

Mental health concerns

Mental health is a major concern among LGBTQ people, leading many to depression and suicide. Despite increasing social support among Americans, the social pressure of family and peers toward sexuality and gender identity can lead to high stress and feelings of isolation. Due to social stigma and a lack of offline support, many LGBTQ people turn to online discussion forums to discuss their feelings and to seek advice on navigating daily life.

While considerable work has been done to apply NLP to better quantify mental health discussions online, I would like to see NLP systems more directly applied to specific LGBTQ mental health discussions. For example, using topic models and structured language models, can we identify consistent coming-out narratives (in the same way as e.g. birth narratives) that can then be summarized to show LGBTQ forum members that they are not alone in their experiences? Furthermore, can we identify consistent “clusters” of LGBTQ narratives to provide to stakeholders such as mental health counselors to improve their training and make them feel more confident in addressing minority experiences? I see NLP as a useful part of an analytical toolkit to help counselors better understand the experiences of LGBTQ people.

Social harassment

Although LGBTQ people often turn to the internet for support, they also face violent harassment from others online. This is a huge problem on public-facing platforms such as YouTube, where LGBTQ people regularly face harassment that targets their identity expression, which is compounded by the platforms’ unfair policies with respect to content creators’ rights. Harassment often evades typical detection systems that focus on keywords (e.g. catching the slur “f**t”) through coded language or microaggressions that implicitly degrade people based on their identity (e.g. “lispy queer”).

As noted by a recent paper, these microaggressions are subtle in how they degrade people based on their identity (e.g. “my gay friend doesn’t have a problem with this show”). Researchers should test the limits of current contextual language models in detecting microaggressions toward LGBTQ people and furthermore assess the harm being done to people reading such comments. Providing further proof of the damage done through subtle examples of harassment can further justify development of systems to mitigate harassment.

Gender bias

NLP systems such as word embeddings are often found to contain bias toward female-associated words: a system trained on news text may make a flawed analogy of “man is to woman as doctor is to nurse”. While a recent wave of NLP research has focused on eliminating the female vs. male bias, there is still much to say about bias toward transgender people, as well as for people whose gender does not fit in the typical male/female binary. One issue that transgender people face is misgendering, which can happen in a system such as Google Translate where a person’s gender must be inferred by the system. In the Google Translate example below, the American actress Candis Cayne is assigned the masculine grammatical case (Spanish “alto” vs. “alta”), even when the following sentence uses a female pronoun.

google_translate_misgender

If we want to prevent gender bias in NLP systems, we should audit existing systems for possible bias against transgender people and allow for the possibility of not imposing explicit gender on the users.

Conversations first

In conversations with NLP researchers, I’m struck by how often people cite specific language patterns that made them excited to start working in a particular problem space: e.g. the “doctor”/”nurse” example above. I want more of those examples to come directly from conversations with affected people, so that researchers will remember that they serve people first and abstract problems second. Humans are not abstract mathematical representations to be optimized in a loss function.

I want researchers to want to help. This will mean providing space for LGBTQ people to make their voices heard to those who have the power to design AI systems, through work like Project Respect. These conversations will surely be difficult, but they will help us move toward a society where AI can serve everyone, not just the “typical” user. If LGBTQ people will not be heard, then we will be left behind.

25 Nov 02:14

כך תעשה

by יובל פינטר

ככל שעוברות עלי יותר שנים בחשיפה לעברית ולאנגלית, ההבדלים ביניהן אמורים פחות להפריע לי. אני כבר לא מצפה לשים לב לדברים שיצרמו לי באופן בלתי מוכר.

ועם זאת, בשנתי הרביעית בפרקי הנוכחי בניכר, אני שמח להכריז על צרימה ממבע באנגלית (אמריקאית, לפחות דרומית אבל נדמה לי שלא רק) שתפסה אותי יחסית בהפתעה. את הביטוי "have a nice day" לבטח לא צריך להסביר לקהל הקוראים, יש לו אפילו מקבילה עברית צולעת "שיהיה לך יום נעים". ההפתעה מגיעה כשאדם מאחל כך לזולתו ומקבל בתגובה

Thanks, you do the same.

גם אתם מופתעים לשניה? בפעם-פעמיים הראשונות חשבתי שאולי העונה לא ממש מרוכז, או לא דובר ילידי של אנגלית, אבל וואלה כבר שמעתי את זה עשרות פעמים.

מקור ההפתעה במעין דיסוננס שנובע מההבדל התחבירי בין הפעל "להיות" במובנו השייכותי בעברית אל מול have. אין פה תופעה בלתי-מוכרת בעברית, הרי כבר דנו בצורה הלא-תקנית-בעברית-תנ"כית "יש לי את הספר", בה מבחינה רשמית את קודם לנושא ולא למושא, והמסקנה המקובלת היא שעברית מושפעת משפות אירופאיות בהן התרגום למשפט הוא משהו מהמבנה I have the book, שם אין ספק שהקניין מופיע כמושא.

אז השפעה-השפעה, ובכל זאת you do the same צורם לי. איך היינו עונים באותה רוח לאיחול העברי? משהו כמו "שיהיה גם לך" (בלי "תודה", כמובן). משהו בתגובה הזאת מרגיש… פסיבי. האדם שיש לו יום נעים סוג-של "מחזיק" בו, היום הוטל עליו, שפר עליו גורלו. בגרסה האנגלית, לעומת זאת, מרגישים (אני לפחות) שצריך לעשות משהו כדי לשריין את אותו היום הנעים, הרי ממש אומרים שם do. כדובר עברית, זה לא נשמע לי כמו משהו שאפשר לעשות.

האם אני רומז כאן לפער תרבותי בו הלבנטין מצפה שהעולם יטיל עליו יום נעים ואילו האירופאי החרוץ עמל להשגת טובו של היום? אני חושב שההבדל נעוץ יותר קרוב לשפה. אנחנו אולי נתרגם את do ברוב המקרים ל-"לעשות", אבל זה פעל שמתפקד גם כפעל-עזר עבור דברים הרבה פחות מוחשיים. ספציפית do the same פשוט מחליף כל צירוף פעלי בשפה באותה מידה שכינויי-גוף כמו he, she, it מחליפים צירופים שמניים שידועים בהקשר. והמהדרין גם מוסיפים דוגמאות ללא כל מקבילה עברית כמו how do you do?. אז נדמה לי שמקור ההפתעה שלי הוא אותו זכר ישן לתצורת "יש" כפעל שמצפה דווקא לנושא בתור הדבר שישנו, ולכן לא מתאים להחליפו בביטוי שמחליף צירוף פעל המורכב מפעל והמושאים שלו.

שיהיה לכם יום נעים!

View attached file (2ad9ce4211a2025857267d38ca328e91?s=96&d=https%3A%2F%2F2.gravatar.com%2Favatar%2Fad516503a11cd5ca435acc9bb6523536%3Fs%3D96&r=G, unknown)

11 Nov 18:38

כיבוש הנפש

by שלומי דסקל

"קולנוע של עם ללא מדינה, ללא שוק מוגדר, ללא מוסדות וקרנות לאומיים שיתמכו בו" • "יש עשייה, לא תעשייה, אבל מאוד קשה להגדיר אותה. זה תהליך שמתהווה. כל שנה יש יותר סרטים. יש דור חדש שרוצה לעשות קולנוע" • "אם אתה מסתכל על העולם הערבי, הסרטים הפלסטינים מובילים בפסטיבלים" • העיתונאי שאדי בלאן, מחבר הספר "סרט עלילתי פלסטיני", בשיחה על קולנוע פלסטיני

View attached file (shadi-picture-144x81.jpg, image/jpeg)

08 Nov 03:55

Saturday Morning Breakfast Cereal - Note

by tech@thehiveworks.com

Yuval Pinter
worst continuity ever, steering wheel.

Click here to go see the bonus panel!

Hovertext:
Really, you should just have a roll of these printed, just in case.

Today's News:

Tindalostalbot, Cary likes this

08 Nov 03:54

Saturday Morning Breakfast Cereal - Destroy

by tech@thehiveworks.com

Click here to go see the bonus panel!

Hovertext:
If you ever want to see some sheer brutality, watch an ecologist come across a nest of invasives.

Today's News:

Tindalostalbot, Cary likes this

13 Oct 18:01

Comic for 2019.10.12

New Cyanide and Happiness Comic

personal responsibility likes this

10 Oct 13:13

Trouble on the Roof

by Doug

Trouble on the Roof

And more kangaroos.

30 Sep 16:56

Saturday Morning Breakfast Cereal - Trading

by tech@thehiveworks.com

Click here to go see the bonus panel!

Hovertext:
I happen to be reading a book about the history of conspiracy theories today, so let me just say for the record that I don't believe a flaming ram's skull interns at high frequency trading firms.

Today's News:

Jacopo.bertolotti, Tindalostalbot likes this

23 Sep 15:42

Comic for 2019.09.23

New Cyanide and Happiness Comic

personal responsibility, Merijn likes this

22 Sep 22:41

עוברים דירה

by דורפן

Yuval Pinter
ביי דורפי

אחרי 9 שנים, 13 אם סופרים את הבלוג שלי כשהיה עצמאי, דה–באזר עוזב את הבלוגספירה ועובר למדיה החברתית. בעיקר לפייסבוק – שם יופיעו הפוסטים – וגם לפלטפורמות אחרות. זו השורה התחתונה כאן למעלה. אפשר לדלג על הסנטימנטים ולגשת לשם ישירות:

לעמוד דה באזר.

הסיבות?

ראשית, רייטינג! כבר תקופה ארוכה אנחנו משווים תגובות וחשיפה של חומרים במדיה החברתית, לחשיפה בבלוג עצמו והמסקנה די ברורה. קיומו של הבלוג כרגע בעצם מפחית במשהו את החשיפה לחומרים. ואנחנו כותבים – כתבנו 16 אלף פוסטים, אני כתבתי 4000 מהם – ורוצים יותר קוראים.

שנית, ניצן חורש. הוא לוקח על עצמו את תפקיד הובלת דה באזר בהתלהבות גדולה, עם נסיון לא קטן בתחום. אז לאחר שיחות קדחתניות איתו – בוא אומר לי את זה בערך שנתיים– יוביל מעכשיו את דה באזר למחוזות חדשים.

ויש עניין שלישי. בנקודת הזמן הזו פייסבוק מייצר דיון איכותי יותר מאשר ״תגובות״״. חלקן טכנולוגיות וחלקן פסיכולוגיות.

והקוראים והדיון הם העיקר ולהם תודה: על 754 אלף תגובות.

כמעט כל הכותבים עוברים איתנו. באופן אישי אני משוכנע שזה יגדיל את כמות הכתיבה שלי. אז יאללה, נתראה בצד השני.

כאן

וכאן:

https://www.facebook.com/WeAreDebuzzer/

21 Sep 18:52

"The applicant does not address the broader impacts of the proposal, other than its potential benefit..."

“The applicant does not address the broader impacts of the proposal, other than its potential benefit to global human health.”

mkvande likes this

09 Sep 16:14

איפכא מסתברא

by אורן פרסיקו

חדשות 13 התנצלו על אמירות שיוחסו לרב ממכינה קדם-צבאית והשעו את הכתב

View attached file (76575478-1-144x81.png, image/png)

03 Sep 23:09

Eyes or Iceberg.

by languagehat

Nick Paumgarten’s “The Message of Measles” (New Yorker, Sept. 2) is well worth reading for the importance of the subject, but Paumgarten is a lively writer with an eye for a good quote, and I was particularly struck by this:

For public-health officials like Zucker, measles was a clear and present concern on its own, but, more significant, it was a leading indicator of a societal failure. Mark Mulligan, the director of the Vaccine Center at N.Y.U. Langone, said, “This outbreak is the eyes of the hippopotamus.”

The eyes of the hippopotamus! What a great substitute for the hopelessly clichéd “tip of the iceberg”! I reproduce it here in hopes that it will get wider use (and perhaps become a cliché in its own right).

21 Aug 13:46

סבר פלוצקר ומלחמתו במציאות

by שוקי טאוסיג

מאחורי הקלעים של המתקפה על הרפורמה בבנקים

View attached file (6547567-12-144x81.png, image/png)

05 Aug 14:58

Dinner Tonight

by Doug

Dinner Tonight

And more spiders.

25 Jul 14:40

The Voracious Reader

by Doug

The Voracious Reader

And more trees.

RaptoR, Cary likes this

22 Jul 18:02

Keeps You Going

by Reza

Vvicked, Cary and 5 others like this

04 Jul 23:19

Profiteer rolls

by Mark Liberman

Seen on a buffet table in Glasgow: "Profiteer Rolls" for "Profiteroles".

There are a fair number of other examples Out There, but not enough to merit a separate dictionary entry, much less to eclipse the original, as in the case of Jerusalem Artichokes.

The OED's etymology:

Apparently < Middle French, French profiterole (although this is first attested later in the sense relevant to sense 1: 1549; 1881 in sense 2) < profit profit n. + -erole , diminutive suffix (extended form of -ole -ole suffix).
French profiterole is attested slightly earlier in its literal sense 'small profit': 1542.

[h/t Bob Ladd]

mkvande likes this

26 Jun 13:40

האצבע הרועדת על ההדק

by Yoav Karny

לא רע אם מתעורר ספק אצל נשיא אמריקאי לפני שהוא מורה על מהלך צבאי עם פוטנציאל הרסני. אבל אם הספק הזה מייצג התכחשות להיסטוריה של מנהיגוּת ושל מעורבוּת, הוא צריך להדאיג את בעלי בריתה של ארה״ב. הנשיא...

זו רק ההתחלה. היכנסו-נא, הדלת פתוחה לרווחה

20 Jun 00:23

Comic for 2019.06.19

New Cyanide and Happiness Comic

personal responsibility likes this

19 Jun 13:41

Saturday Morning Breakfast Cereal - Torture

by tech@thehiveworks.com

Click here to go see the bonus panel!

Hovertext:
If that doesn't work, the next step is putting in contact lenses.

Today's News:

Eve Liquide, Tindalostalbot likes this

10 Jun 01:58

i'm not even certain if it's gauche to give gifts during funerals. probably best to just play it safe and hold onto that new playstation for later though


archive - contact - sexy exciting merchandise - search - about

THE ONES THAT WENT EXTINCT BEFORE DOING ANYTHING ARE JUST THE ONES THAT WERE NOT PARTICULARLY GREAT AT BIDING

← previous

May 31st, 2019

– Ryan

07 Jun 20:42

Libfix report for June 2019

by Kyle Gorman

You may be familiar with fatberg, a mass of non-biodegradable solids and fats found in sewers, which suggests -berg has been innovated (presumably via iceberg). And now London is also haunted by a concreteberg.

Late great tech unicorn Theranos made use of a proprietary blood-collection device they called the nanotainer (via container), and I recently found out about vacutainer and a security software package called Cryptainer. So -tainer has been liberated.

The other day in Queens I saw a sign for a Mathnasium, presumably extracted from gymnasium, and the Corpus of Contemporary American English also has a token of jamnasium (a space for jam seshes), suggesting a nascent -nasium.

In a recent, widely-derided ad campaign, Applebee’s coined sizzletonin on analogy with the neurotransmitter seratonin and the hormone melatonin, but as far as I know that’s the end of the line for -tonin.

Yuval Pinter

Shared posts

1) Universal unsupervised pretraining

2) Lottery tickets

3) The Neural Tangent Kernel

4) Unsupervised multilingual learning

5) More robust benchmarks

6) ML and NLP for science

7) Fixing decoding errors in NLG

8) Augmenting pretrained models

9) Efficient and long-range Transformers

10) More reliable analysis methods

References

Mental health concerns

Social harassment

Gender bias

Conversations first