The Old Reader

03 Nov 17:16

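What Is New in Amos Schocken's Remarks?

by Hanan Amiur

The country (and 'Haaretz' as well) was in an uproar over the weekend following the speech by 'Haaretz' publisher Amos Schocken at the Haaretz conference in London. But

Three main points in his remarks stirred up a storm: that Israel is an apartheid state, that Palestinian terrorists are "freedom fighters", and that sanctions should be imposed on Israel.

Was there really anything new in Schocken's remarks?

Clearly not.

Schocken said that Israel is an apartheid state? That accusation has been hurled from the pages of 'Haaretz' in Israel, in countless variations, countless times before.

He said that Palestinian terrorists are "freedom fighters"? That compliment, too, has been handed out generously in the pages of 'Haaretz' before, even to the worst Palestinian terrorists.

Amira Hass asserted that the Nukhba murderers who raided the Gaza envelope merely "broke through, by various methods, the walls of the largest prison in the world" and carried out the barbaric massacre because of "every person's aspiration to freedom".

Gideon Levy described the escapees from Gilboa Prison, almost all of them convicted murderers who murdered Jews only because they were Jews, as daring freedom fighters.

The commanders of Islamic Jihad in Judea and Samaria, each in his turn, were also described by him as "freedom fighters",

and so were Hamas operatives who clashed with IDF soldiers at the perimeter fence of the Gaza Strip.

Yitzhak Laor also wrote in 'Haaretz' in the past that the terrorists are "freedom fighters",

and there were, of course, many more.

In his speech Schocken also called for international sanctions on Israel, but this call too, like the call for a boycott, has been voiced by 'Haaretz' many times before, including by Amos Schocken himself.

So what exactly is the fuss about?

The post What Is New in Amos Schocken's Remarks? appeared first on Presspectiva.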

26 Oct 14:40

George Kingsley Zipf is famous for his work on the power-law distribution of word frequencies, which has come to be known as Zipf's Law. And he's also known for the related "Law of Abbreviation", and the hypothesized balance between effort and efficacy.

In his 1945 paper "The repetition of words, time-perspective, and semantic balance", Zipf looks at a different distribution, which is much less famous:

In the present study we shall attempt to show in preliminary outline how the rate of repetition of words in the stream of speech may be useful not only in indicating what we shall presently define as "time-perspective" but also in elucidating what we shall presently refer to as "semantic balance" – two terms of potential significance in the understanding of personality variants.

"Personality variants?" Wait for it…

That paper's Figure 2, which presents its main empirical evidence about word-repetition intervals, gives us a clue about why the initial uptake for this idea was so slow:

Caption: The Number of Intervals of Like Sizes (in Terms of Pages) between the Repetitions of Words Occurring Five Times in James Joyce's Ulysses with Interval-Sizes Taking on Integral Values from 1 Through 50 Pages Inclusive.

Zipf could start from published word-count data in that case — M.L. Hanley's 1937 Word Index to James Joyce's Ulysses — but the analysis was still a labor-intensive addition to Hanley's labor-intensive foundation. Digital text and computer analysis make such analyses easy today, by comparison, though few have done it. More on that in a later post.

For now, I want to share with you a striking (or maybe weird) idea that occupies most of Zipf's 1945 paper, presenting a mathematical model for a demon ringing a set of bells.

Zipf introduces his bell-demon this way:

Let us take n bells that are equivalent in size and equally difficult to ring, and then let us attach them to a long straight board in such a manner that the bells are equally spaced along the board. At one end of the board we shall place a blackboard ruled with n columns for the respective bells; and we shall also station a demon there to act as bell-ringer. The demon must ring one bell once each second of time, and after he has finished ringing a bell once he must return to the blackboard to record that fact in the bell's column. Thus in order to ring one bell 10 times, or 10 bells once each, he will make 10 round trips down the board and back in the space of 10 seconds, and will have 10 marks therefor on the blackboard. (And we shall ask the demon to make his round trips over shortest distances).

This analogue is interesting for many reasons. First of all the demon's work, w, in terms of making a round trip to ring a given bell, will increase in direct proportion to the bell's distance, d, from the blackboard (or w = d). And since the distance of the respective bells increases integrally from the blackboard (i.e., 1d, 2d, 3d, ..., nd), it follows that the bells are arranged in respect of the demon's work, w, in getting to and from them according to the simple series, 1w, 2w, 3w, ..., nw.

Now if we ask our demon to ring each bell with a frequency, f, that is inversely proportionate to the round-trip work involved, or in equation form, $w \times f = C$, he will ring the closer (and easier) bells proportionately more often than the distant (and harder) bells. And since the ranked-frequency in decreasing order, r, with which each bell is rung will be equal to the bell's w above, we come upon the familiar equation:

(1) $r \times f = C$

However if we now ask the demon to ring all bells according to Equation 1 but to stop after he has rung the nth and farthest bell once (n = C) and after he has rung all other bells their allotted times, then the n bells will have been rung approximately according to the equation

$$F \cdot S_n = \frac{F}{1} + \frac{F}{2} + \frac{F}{3} + \cdots + \frac{F}{n}$$

in which $F \cdot S_n$ represents the total of round trips made (as well as the total number of running seconds of time) and where $F$ represents the total number of times the nearest bell is rung, and where $\frac{F}{n} = 1$ (or, if you will, where $F = n$), with p omitted above because it equals 1.
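To make the arithmetic concrete, here is a minimal sketch of the demon's tally for a small board. It assumes F = n, as in the passage, and uses integer division to get whole numbers of rings, which is my own simplification (Zipf only says the bells are rung "approximately" according to the equation). The function names are made up for illustration.

```python
def bell_ring_counts(n):
    """Rings per bell, nearest bell first, assuming F = n.

    The bell at rank r is rung about F / r times; integer division
    is an assumption of this sketch, not something from the paper.
    """
    F = n
    return [F // r for r in range(1, n + 1)]


def total_round_trips(n):
    """Total round trips (and seconds) the demon needs for n bells."""
    return sum(bell_ring_counts(n))


counts = bell_ring_counts(10)
print(counts)                 # nearest bell rung 10 times, farthest once
print(total_round_trips(10))  # total marks on the blackboard
```

With ten bells the nearest is rung ten times and the farthest once, and the total is close to (a bit below) the exact harmonic sum $F \cdot S_n$ because of the rounding down.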

This gives him his power law for the counts of individual bells, but so far, it puts no constraint on their inter-ringing intervals. As he observes:

Of course the above equation puts no restriction upon the order in which the demon rings the bells. Thus he may ring the nearest bell its allotted $F$ times before ringing the 2nd nearest bell its allotted $F / 2$ times, and so on progressively down the board until he has rung the nth and farthest bell a single time. In short he might always ring "the easiest remaining bell first," while postponing as long as possible the more distant and hence more difficult bells. The chief drawback of ringing "the easiest first" is that the demon will be forced to run faster and faster, and therefore to work at an ever increasing rate, as he proceeds farther and farther down the board, if he is to complete each round-trip within the prescribed second. And in so doing he will be unevenly distributing his work over time with the risk of collapsing before he gets the nth bell rung.

So he adds a policy to optimize the demon's effort:

In order to correct this uneven distribution of work over time, we may ask the demon to distribute his work as evenly as possible over time while still ringing his bells according to Equation 3. Yet as soon as he does distribute his work evenly over time, he will automatically ring the bells in such a way that the sizes of the interval, $I_{f}$, between the respective repetitions of the bells will approximate the equation:

(4) $N^{p} \cdot I_{f} =$ a constant

with the exponent, $p$, equal to 1.

For more demonic mathematics about "balancing the frequency of easy acts against the rarity of difficult acts", read the paper, if you're interested. For present purposes, let's jump to Zipf's observation about the "abnormal time-perspective […] represented by the median 1.20 slope of Joyce's Ulysses […] which suggests a slightly abnormal preference for longer intervals".

Thus having once "rung a bell," Joyce tends systematically to avoid its repetition abnormally. In other words, events of the past (as represented by words) seem to be systematically more remote from the present than is actually the case with 1.00 time-perspective. Although this general type of over-long time distortion is probably not infrequent among those personalities who focus their attention primarily upon the present moment, it is interesting to note that this particular distortion of time is found in a novel that is characterized for just that attribute (if we may so interpret the words, "stream of consciousness" writing).

And now the punch line:

Other types of time-perspective – and not necessarily linear – can be defined in terms of the bell-analogy, yet there is one we mention cursorily lest it be ignored. We refer to the case in which the demon saves work and simplifies the problem of distributing his work evenly over time by simply bending the straight board into a quasi-arc. In this fashion the distant bells become nearer, and the demon can take short-cuts to them. This type of time-distortion we shall call schizophrenic unbalance and we shall treat it in greater detail in a future publication.

Time-perspective, in terms of the distribution of minimalized work over time (with all its endless ramifications) would seem to be an inviting topic for the study of the normal and abnormal of human mental behavior.

As far as I can tell, Zipf never actually treated "schizophrenic unbalance […] in greater detail in a future publication". This may be because he died in 1950 at the age of 48.

Nor did anyone else follow up on inter-word repetition statistics as a sign of "schizophrenic unbalance", at least not using the same phrase, though there are some adjacent things, like this paper, and commenters may be able to point us to others.

Update — To avoid further misunderstandings, let me point out that the cited 2013 paper (Todder et al., "Non-Linear Dynamic Analysis of Inter-Word Time Intervals in Psychotic Speech") is based on a completely different measure.

Zipf's metric was the interval (in pages) between two occurrences of the same word, e.g. the word "accurate" occurs in the novel Ulysses on pages 434, 575, 590, 605, and 615, yielding intervals of 141, 15, 15, and 10 pages.
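As a quick illustration of how easy this measure is to compute today, here is a small sketch that turns a word's page numbers into Zipf-style repetition intervals and tallies how often each interval size occurs. The page list is the "accurate" example above; the function name is just for illustration.

```python
from collections import Counter


def repetition_intervals(pages):
    """Differences between successive pages on which a word occurs."""
    return [later - earlier for earlier, later in zip(pages, pages[1:])]


# Pages on which "accurate" occurs in Ulysses, per the example above.
accurate_pages = [434, 575, 590, 605, 615]

intervals = repetition_intervals(accurate_pages)
print(intervals)           # [141, 15, 15, 10]
print(Counter(intervals))  # how many intervals of each size
```

Pooling such tallies over all words with a given frequency gives the kind of counts plotted in Zipf's Figure 2.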

Todder et al. measure the interval in seconds between successive words (whatever they are) in the stream of speech, so that the production in TIMIT of SA1 by speaker FLNH0

0.188 0.378 she
0.378 0.637 had
0.587 0.703 your
0.703 1.010 dark
1.010 1.339 suit
1.339 1.426 in
1.426 1.773 greasy
1.773 2.091 wash
2.149 2.478 water
2.478 2.643 all
2.643 2.938 year

yields inter-word-onset intervals in seconds of

0.190 0.209 0.116 0.307 0.329 0.087 0.347 0.376 0.329 0.165
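For contrast, here is a sketch of the Todder-style measure applied to the word alignment listed above: it ignores word identity and simply differences successive onset times. The three-column text format (onset, offset, word) is taken from the listing; the function name is hypothetical, and rounding to milliseconds is only there to tidy floating-point noise.

```python
ALIGNMENT = """\
0.188 0.378 she
0.378 0.637 had
0.587 0.703 your
0.703 1.010 dark
1.010 1.339 suit
1.339 1.426 in
1.426 1.773 greasy
1.773 2.091 wash
2.149 2.478 water
2.478 2.643 all
2.643 2.938 year
"""


def onset_intervals(alignment_text):
    """Seconds between the onsets of successive words."""
    onsets = [float(line.split()[0])
              for line in alignment_text.splitlines() if line.strip()]
    return [round(b - a, 3) for a, b in zip(onsets, onsets[1:])]


print(onset_intervals(ALIGNMENT))
```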

I should also note that Todder et al. don't cite Zipf 1945, and show no signs of having been influenced in any way by that work.

Apologies for confusing people by waving a hand in the direction of the 2013 paper…

04 Oct 18:32

The Oyster’s Enemies.

by languagehat

I was astonished to read this quote from Robin Williams, talking about Jack Nicholson:

“He once was with me at a benefit and leaned over and said, ‘Even oysters have enemies’. In a very intense voice, I responded with, ‘Increase your dosage’. More fascinated than scared. He says things that even Buddha goes, ‘What did you mean?’”

To an English-speaker, “Even oysters have enemies” sounds like weirdness from the outer limits, but any reader of Russian literature would immediately recognize it as a version of “И устрица имеет врагов!” [Even an oyster has enemies!], No. 86 in Плоды раздумья [Fruits of meditation] by the great (and fictional) Kozma Prutkov. I have no idea how it wound up in the brain of Jack Nicholson, but it gave me a frisson of delight.

(Incidentally, устрица ‘oyster’ is borrowed from Dutch oester, which is from Latin ostrea; our oyster comes from the same Latin word, but via French. And враг ‘enemy’ is borrowed from Old Church Slavonic; the inherited doublet is во́рог, which is archaic or folk-poetic.)

15 May 11:46

Fake

by Doug

Fake

And more impostors.

09 Apr 08:08

Saturday Morning Breakfast Cereal - Hope

by Zach Weinersmith

Click here to go see the bonus panel!

Hovertext:
No, OR is not a typo, but the lack of a bottom strap in panel 3 will require some retconning.

Today's News:

25 Mar 09:36

It Geed.

by languagehat

A puzzled correspondent sent me this quote, saying the final word was a mystery to him:

There were papers, letters, and paid bills and miscellaneous items, including the stuff from her room at the office, but there was no diary or anything resembling one, and there was nothing that seemed likely to be of any help. If it got too tough I might have to have another go at it or put Saul Panzer on it. I did use a few of the items, in Elinor’s handwriting, to check the writing on the letter that was in the box with the money. It geed.

The Father Hunt, by Rex Stout (Bantam pbk., 1971, p. 18)

I sent him a link to Green’s Dictionary of Slang, but it occurred to me that this long-forgotten term might be of interest to others (and perhaps clarify similar mysteries), so here’s Green’s definition and a few citations:

gee v.¹
also jee
[? pron. of initial letter of SE go]

to fit, to suit, to behave as required or expected; usu. in phr. it won’t gee, it doesn’t suit, it doesn’t work.

c.1698 [UK] B.E. Dict. Canting Crew n.p.: It wont Gee, it won’t Hit, or go.
1719 [UK] in D’Urfey Pills to Purge Melancholy V 83: If Miss prove peevish and will not gee / […] / find out a fairer, a kinder than she.
[…]
1887 [Aus] Bulletin (Sydney) 5 Nov. 7/3: Italian opera ‘gees’ in a general way at Melbourne Royal, but not to any alarming degree.
[…]
1904 [Aus] West. Australian 12 Apr. 9/2: They all reckon they can bring […] in enough sentiment to make it gee.
[…]
1925 [US] Odum & Johnson Negro and His Songs (1964) 154: Yes, I hollow at the mule, an’ the mule would not gee.

I say “long-forgotten,” but of course I shouldn’t assume: are any of y’all familiar with this short, punchy verb? Also, how does Green know that last quote doesn’t involve gee “(intransitive) Of a horse, pack animal, etc.: to move forward; go faster; or turn in a direction away from the driver, typically to the right”?

24 Jan 10:46

Standards

by Nicholas Gurewitch

The post Standards appeared first on The Perry Bible Fellowship.

09 Nov 08:13

Neologism of the day: Himbonym

by Ray Johns

Making up more words for words

Today, I wanted to tell someone that it was okay for them to use particular gendered terms of address when speaking to me — namely “dude”, “man”, “bro”.

In general, I am fine with himbonyms, which is what I have decided to call familiar or affectionate terms of address that connote masculine gender.

31 Oct 07:05

Neologism of the day: Recipronym

by Ray Johns

Did you know you can just make up new words if you want to?

Photo by Fabio Santaniello Bruun on Unsplash

Yuval Pinter, NLP researcher and academic, asked in a tweet:

Is there a name for pairs of words that describe each other? e.g. “inflection” is derived, and “derived” is an inflection

Some other examples of the phenomenon (collected from the Tweet replies and other linguists):

“monosyllabic” — “long”
“eight-lettered” — “compound”
“calque” — “loanword”

I suggest we call them recipronyms. An admittedly lazy Google search turned up no hits, but if this word preexists my coinage of it, I would love to see the citations. Leave a comment if you find any!

28 Aug 13:03

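Shoham Smith Presents: Lies I Told My Daughter

by Hanan Amiur

Last Friday, 'Haaretz' expressed clear sympathy for Palestinian terrorism.

Those who sympathize with Palestinian terrorism are welcome to it, but if that sympathy rests on false facts, the record should be set straight.

Shoham Smith writes with frustration in 'Haaretz' about a conversation with her daughter. She is frustrated that, after the execution of the late Batsheva Nigri in front of Nigri's six-year-old daughter, her own daughter shows hostility toward Palestinian terrorism.

This is hard for Smith, and in her article she tries to fob her daughter off with a fundamental lie in order to justify the terrorism:

"You understand", I try for the thousandth time, "that this attack would not have happened if Israelis did not insist on living in the territories and moving around in them? That people who live under military rule, without human rights, who are killed every day, who have no future, in the end, out of despair and frustration, take a rifle and kill someone?"

This is the first lie. As often as this lie is repeated, that the settlements and the "occupation" are the cause of terrorism, we will repeat the reminder, for Shoham Smith's daughter too:

Contrary to what Mom tells you, it is not the settlements that cause terrorism but the very existence of Jewish life here.

Palestinian terrorism against the Jews in the land began long ago, with no connection to the settlements that would be established more than a hundred years later.

One can count from individual murders going back more than 170 years, when in 1851 Palestinians in Jerusalem murdered Avraham Shlomo Zalman Zoref with a sword blow to the head from behind as he walked down a street in the Old City of Jerusalem, because he had returned the Hurva of Rabbi Yehuda HeHasid to Jewish hands.

One can also count from the start of organized Arab terrorism, which made its debut in the riots of 1920, 47 years before the "occupation". That was also when the pattern of terrorism for the following decades was set: the murder of random Jews in the street, the rape of women and girls, and the looting and destruction of shops and homes.

Another option is to count from the founding of the Palestinian terrorist organizations that to this day engage in the indiscriminate murder of civilians. They too were founded years before even a single settlement was established, and of course with no connection to the matter.

There is also a fourth option, which is to examine the official charters of the Palestinian terrorist organizations and see what they state explicitly: the goal of the struggle is not ending the occupation but the elimination of the entire State of Israel.

And from here to the second lie. This is how Smith opens her article:

Another boy has been murdered. I look at his beautiful face, at the thin smile beneath the first trace of a mustache. A handsome boy with a beanie of the kind worn by boys in my daughter's scout troop, the kind she would describe as cute, cool. He could have been her friend.

He will not be her friend. He never had a chance. Until yesterday he lived beyond the Mountains of Darkness, and today he is dead. The IDF shot a bullet into his head.

At the end of the article she also adds his name:

This morning, at the sight of the face of the boy Othman Abu Kharj, I again could not restrain myself. I showed her the photograph: "This is a boy your age. A boy who did no harm to anyone".

"Cute", "murdered" by the IDF after having "done no harm to anyone"?

The "boy" who did no harm to anyone was a terrorist in the Islamic Jihad terrorist organization, and was even buried with the organization's headband on his head.

According to Palestinian sources he was even "an expert in manufacturing and planting explosive devices".

In the mourning notice published by Islamic Jihad, he is described as "a fighter in the deterrence unit" who fell in battle "with the enemies" in Jenin, together with other "mujahideen" from his organization.

What can one say: "cool" is indeed a fitting description.

So, daughter of Shoham Smith, you only have one mother, but not everything has to be learned from her.

For example, not sympathy for Islamist zealots who regard the indiscriminate murder of civilians, solely because they are Jews, as a religious commandment.

The post Shoham Smith Presents: Lies I Told My Daughter appeared first on Presspectiva.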

23 Aug 14:19

Ron's Princibles

by Mark Liberman

Yuval Pinter
it me

Sunday's post on "Listless vessels" opened with this clip:

The movement has got to be
about what are you trying to achieve on behalf of the American people
and that's got to be based in principle
uh because if you're not rooted in principle
uh if all we are is listless vessels that just supposed to follow
you know whatever happens to come down the pike on Truth Social every morning
that- that's not going to be a durable movement

And in the 30th comment, Yuval wrote

FWIW, both utterances of "principle" sound like 'princible' to me.

He's absolutely right — but what those two words "sound like" leaves an important theoretical (and practical) question open.

First, let's look at (and listen to) Ron's "principles".

Here's the first one, followed by an image of the waveform and spectrogram, with the closure of (what the dictionary says is) the /p/ of "-ple" outlined in red:

If you're at all familiar with interpreting images of waveforms and spectrograms, you'll see that the voicing continues throughout the closure, and the release is weak, short, and voiced (unlike the burst of the word-initial /p/).

And here's the second one:

Same thing.

This is an instance of a much more general fact about English, and American English in particular: consonants are "lenited" (i.e. weakened), especially in intervocalic position, when they aren't part of the onset of a stressed syllable. "Lenition" often means that voiceless stops become voiced and voiced stops become fricative-like, and more extreme changes in phonetic performance are also common.

The best-known example of this type of allophonic variation is the "flapping and voicing" of /t/, that results in "latter" sounding the same as "ladder". What Gov. DeSantis does with the second /p/ in "principle" is another example of the same sort of thing.

And that sort of thing happens to the word "principle" a lot. The NPR podcast corpus I've mentioned before has 2,160 examples of the word "principle", of which I selected 200 at random. Most of them exhibit a variety of lenitions in the final two syllables — which are sometimes further reduced to something like [svl̩] in IPA-ese. Gov. DeSantis' less-reduced pattern is common — here's Barbara Bradley Hagerty, from "Catholics Split On Obama's Birth Control Decision", NPR All Things Considered 2/10/2012:

Now, she says, the country can implement health care reform,
which has at its core a principle dear to the church

And zeroing in on "principle":

Again the realization of the /p/ is voiced throughout, though there's a slightly stronger release than in the DeSantis examples.

So what's the "important theoretical question" that I brandished at the start of this post?

Have these instances of (the phonological category of) consonant /p/ really turned into instances of (the phonological category of) consonant /b/? Or are these examples just one region in a spectrum of variably-lenited and co-articulated phonetic implementations of the abstract phonological pattern in question?

In other words, is (this kind of) allophonic variation a mapping of symbols to symbols? Or is it just part of the (necessary) translation of symbols into (articulatory and acoustic) signals?

In my 2018 paper "Towards Progress in Theories of Language Sound Structure", among other places, I argue that the null hypothesis should be the phonetic implementation theory, claiming that in every case where anyone has looked seriously into patterns of allophonic variation, that hypothesis wins empirically. In the cited chapter, I look in depth at some phenomena in English and in Spanish — many other cases in many languages have been studied over the years, e.g. Jiahong Yuan and Mark Liberman, "Investigating Consonant Reduction in Mandarin Chinese with Improved Forced Alignment", InterSpeech 2015.

What does this mean about Ron's "princibles"? I predict, in advance of doing the work, that a large sample of the phonetic implementations (by American speakers) of the second /p/ in "principle", although often voiced, would be different from a comparable sample of implementations of the /b/ in "sensible" — and that the differences would not fit very well with the hypothesis that some instances of /p/ were simply turned into /b/. And similarly for other near-minimal-pairs.


19 Aug 07:49

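Beware of Semantic Shift

by Yuval Pinter

Below is an X post that your faithful writer Yuval sent to himself. Thanks, Yuval:

And thanks to Elon, whose business genius means it is no longer reliable to embed X posts lest they get lost. Good luck with the link, to whoever is celebrating.

Anyway, remember when AI in this context meant "an artificial voice created to simulate a recording of a person who has died or can no longer record", and today it simply means (I reckon) "a recording that cannot be made again"?

Remember when a "selfie" was when a person photographed themselves, and not just any picture with a human being in it?

So it was and so it shall be, and the earth abides forever (until the oceans and all that). Regards from לקלס (lekales), קורבן (korban), awful, and gay/גאה (ge'e).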


14 Aug 16:52

Saturday Morning Breakfast Cereal - Therapy

by Zach Weinersmith

Click here to go see the bonus panel!

Hovertext:
The really cool thing is that those are cargo sandals, to match the cargo socks, cargo pants, cargo shirt.

Today's News:

18 Jul 06:15

Saturday Morning Breakfast Cereal - River

by Zach Weinersmith

Click here to go see the bonus panel!

Hovertext:
Some of the comics are written just so Corey Mohler will look at his screen, shake his head, and sigh.

Today's News:

08 Jul 07:16

Visual pattern recognition in 1955

by Lior Fox

[this is an expanded version of a thread I posted on Twitter on 24.01.23]

Here’s a paper from 1955 that basically suggests the use of what we would today call hierarchical convolutional processing for visual pattern recognition tasks. This is ~7 years before Hubel and Wiesel’s feed-forward model of simple cells, ~20 years before Fukushima’s Neocognitron, and ~30 years before LeCun’s CNNs. The “filters” here were hand-designed rather than learned (though see more on that at the end). Much of the background, intuition, and the problem formulation itself remains completely relevant today.

The paper was part of the session on learning machines at the 1955 Western Joint Computer Conference. The signed author is Gerald Dinneen, but the text makes it clear that this work was done by him together with Oliver Selfridge. Both were at MIT’s Lincoln Laboratory. Indeed, the previous article in the same proceedings is by Selfridge, describing the approach to pattern recognition more generally.

The whole thing is worthwhile having a look at. I will quote some key paragraphs, mostly from Dinneen and some from Selfridge, and add minimal commentary.

Pattern recognition

According to Selfridge:

By pattern recognition we mean the extraction of the significant features from a background of irrelevant detail.

First, Dinneen explains a main reason why this problem is challenging:

Consider, for example, how many different representations of the block capital A we recognize as A. A great number of variations in such things as orientation, thickness, and size can occur without loss of identity. Our real live problem then is to design a machine which will recognize patterns. We shall call any possible visual image, no matter how cluttered or indistinct, a configuration. A pattern is an equivalence class consisting of all those configurations which cause the same output in the machine.

Then, they present the key idea of a hierarchical processing by “a sequence of simple operations”:

Our theory of pattern recognition is that it is possible to reduce by means of a sequence of simple operations a configuration to a single number, or by means of a set of such sequences to a set of numbers. We believe that for the proper sequences, almost all of the configurations belonging to a given pattern reduce to the same number or set of numbers.

Before specifying what kind of “simple operations” those might be, the paper describes the overall architecture and how to simulate it on a digital computer. Note the word “simulate” here. Indeed, it seems that they thought of the digital computer as a tool to simulate the principles according to which a different kind of machine would work, one that is better suited for dealing with such problems. This topic deserves its own short post, hopefully to be written; for now I’ll leave it with another quote from Dinneen (emphasis in the original):

The problem we have stated is not basically arithmetic, and it is not even clear that a machine for pattern recognition should be completely digital. However, one of the very important applications of a high-speed digital computer is the simulation of just such problems.

Overall architecture

The architecture itself was built to handle binary (1 for black, 0 for white) images of size 90*90 (which was chosen “to be as large as was practical in terms of computer storage”). There were 3 “stores” of such images. First there is the “working image” currently being transformed. The result of such a transformation is another image, stored in the “new image”. This new image could then be loaded back into the “working image” store “if a sequence of operations is to be performed”. In addition, there was a special store for the original image (which is also the input image at the beginning), in case it had to be kept throughout.

The paper discusses (at some length) two kinds of operations on images. Both are a “linear non-linear” type of operation: specifically, some convolution followed by thresholding.

The “Averaging” and “Edging” operations

The first one is an “Averaging Operation” for which the convolution part is basically a low-pass filter, smoothing the image. The “kernel size” is taken to be 5*5.
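To make the “convolution followed by thresholding” idea concrete, here is a minimal sketch of an averaging operation on a binary image. It is my own reconstruction, not the paper’s program: the function name, the treatment of cells outside the image as zeros, and the toy 9*9 image are all assumptions for illustration.

```python
def averaging(image, threshold, k=5):
    """Count ones in the k*k window around each cell, then threshold.

    image is a list of rows of 0/1 values. Cells outside the image are
    treated as 0, which is an assumption of this sketch.
    """
    height, width = len(image), len(image[0])
    half = k // 2
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0
            for yy in range(max(0, y - half), min(height, y + half + 1)):
                for xx in range(max(0, x - half), min(width, x + half + 1)):
                    total += image[yy][xx]
            out[y][x] = 1 if total >= threshold else 0
    return out


# Toy image: a 5*5 block of ones with a hole in the middle,
# plus one stray "one" in the top-right corner.
img = [[0] * 9 for _ in range(9)]
for y in range(2, 7):
    for x in range(2, 7):
        img[y][x] = 1
img[4][4] = 0  # the hole
img[0][8] = 1  # the stray one

smoothed = averaging(img, threshold=8)
print(smoothed[4][4], smoothed[0][8])  # hole and stray cell after smoothing
```

With this threshold the hole gets filled in and the stray cell is wiped out, which is the qualitative behaviour the paper’s examples show.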

In the paper, there are many visual examples of what this transformation does to images, as a function of the threshold parameter. Here’s just one:

Smoothing “filled in holes” and “eliminated scattered ‘ones’”

Another version of the operation is also mentioned in which the threshold is set dynamically:

threshold for the small 5X5 window is determined by the degree of homogeneity inside a larger window, say 15X15. In particular, for each cell we count the number of ones inside a 15X15 window surrounding it and use this number to set the threshold for the 5X5 window. This is like an automatic gain control.

The second operation, maybe unsurprisingly, is some kind of a high-pass filter / edge-detector, referred to here as an “Edging operation”. As the paper puts it:

The edging operation, quite unlike the averaging operation, preserves elements which are centers of asymmetry, that is, those which are located in the regions of discontinuity of the image. The operation sharpens differences. It picks out the edges of letters and for certain choices of thresholds locates the corners, junctions, and end points of letters […]

The edging operation is like a two-dimensional derivative, since we count changes about the center element.
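The quotes don’t pin down the exact rule, so the following is only one possible reading of “we count changes about the center element”: for each black cell, count the neighbours in a small window whose value differs from the centre, and keep the cell if that count reaches a threshold. The window size, the fixed (rather than intensity-relative) threshold, and the function name are all assumptions of this sketch, not details from the paper.

```python
def edging(image, threshold, k=3):
    """Keep black cells whose k*k neighbourhood has enough differing cells.

    A hypothetical reading of the edging operation; neighbours outside
    the image are simply skipped.
    """
    height, width = len(image), len(image[0])
    half = k // 2
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            center = image[y][x]
            changes = 0
            for yy in range(max(0, y - half), min(height, y + half + 1)):
                for xx in range(max(0, x - half), min(width, x + half + 1)):
                    if image[yy][xx] != center:
                        changes += 1
            out[y][x] = 1 if center == 1 and changes >= threshold else 0
    return out


# Toy image: a solid 5*5 square inside a 7*7 frame of zeros.
square = [[1 if 1 <= y <= 5 and 1 <= x <= 5 else 0 for x in range(7)]
          for y in range(7)]

edges = edging(square, threshold=3)    # keeps the outline of the square
corners = edging(square, threshold=5)  # keeps only the four corners
print(sum(map(sum, edges)), sum(map(sum, corners)))
```

Even in this toy version, raising the threshold moves from picking out edges to picking out corners, in line with the paper’s remark about “certain choices of thresholds”.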

There are some details of how the threshold is set (relative to the overall “intensity” in the patch) and again many visual examples. Here’s just one:

Why these operations?

The smoothing and derivative-filter operations both look very natural today, but this might not have been the case in those early days of digital visual processing. The paper doesn’t justify these as some ‘well known’ thing either. Rather:

We tried to pick operations which were simple in structure and unrelated to particular patterns. In other words, we avoided special operations which might work very well for A's and O's, but not very well for other shapes. There is evidence in neurophysiology, moreover, that both averaging and edging of visual images are performed by the nervous system of many animals.

I find the neurophysiology comment particularly intriguing. Again, this is before H&W’s discovery of the “simple cell” in the cat visual cortex. There’s no reference here, so it’s hard to say what Dinneen had in mind, but we can consult the commentary piece on that session, in the same proceedings, by none other than Walter Pitts. Pitts seems to be suggesting that the “edging” and “averaging” operations relate to the on-center/off-surround and off-center/on-surround ganglion cells in the retina, discovered by Kuffler (in the early 50’s). I suspect that early works by Hartline on lateral inhibition, also in the retina, are another possible candidate (at least for an edge-detector-like mechanism). Pitts does not mention those explicitly, but seems to hint in that direction as well.

Finally, I can’t help but quote also this paragraph just to appreciate the heroic effort of programming all of this with the technology of the time:

Although the first two operations are very simple, the coding of the program is quite complicated. This is due to a large extent to the necessity of using the individual bits of the storage registers as independent storage bins. We need a lot of shifting and cycling; the simple averaging operation takes about 300 instructions and requires about 20 seconds a letter. The two window averaging takes about 400 instructions and requires about 4 minutes. The edging requires 700 registers of instructions and takes 2 minutes.

And speaking of programming, the acknowledgment section of the paper is a testimony to the reality of the lab, and to who actually programmed and operated the computers back then:

It would have been impossible to obtain the experimental results without the excellent assistance of Miss B. Jensen and Mrs. M. Kannel in programming, coding, and operating the Memory Test Computer.

The “Learning” part

There isn’t much machine learning in Dinneen’s report. But the need for some learning component was already well recognized. We can find the general approach outlined in Selfridge’s paper (which serves as an introduction to the more specific report). The idea was to learn good sequences of operations (it seems to be assumed that the basic operations themselves were to be fixed or pre-given to the machine):

I shall now discuss our plans for having the computer itself hunt for good sequences and assign the proper values. Every sequence is good or bad according as the numbers obtained from applying it to images tend to differ consistently for different symbols.

The planned strategy is a neat description of Supervised Learning:

We now feed the machine A's and O's, telling the machine each time which letter it is. Beside each sequence under the two letters, the machine builds up distribution functions from the results of applying the sequences to the image. Now, since the sequences were chosen completely randomly, it may well be that most of the sequences have very flat distribution functions; that is, they have no information, and the sequences are therefore not significant. Let it discard these and pick some others. Sooner or later, however, some sequences will prove significant; that is, their distribution functions will peak up somewhere. What the machine does now is to build up new sequences like the significant ones. This is the important point. If it merely chose sequences at random it might take a very long while indeed to find the best sequences. But with some successful sequences, or partly successful ones, to guide it, we hope that the process will be much quicker.
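A toy sketch may make the plan concrete. It is not Selfridge's program: the 3x3 letter templates, the noise level, the pool sizes, and the use of "sum of a few chosen pixels" as a stand-in for a sequence of operations are all assumptions made for illustration, and the gap between the two letters' mean outputs is only a crude proxy for a distribution function that "peaks up".

```python
import random

rng = random.Random(0)

# Hypothetical 3x3 templates standing in for the letters "A" and "O".
TEMPLATES = {
    "A": [0, 1, 0,
          1, 1, 1,
          1, 0, 1],
    "O": [1, 1, 1,
          1, 0, 1,
          1, 1, 1],
}

def noisy(letter, flip=0.1):
    """Return the letter's template with each pixel flipped with probability `flip`."""
    return [p ^ (rng.random() < flip) for p in TEMPLATES[letter]]

def apply_seq(seq, image):
    """Stand-in for 'apply a sequence of operations, get a number': sum the chosen pixels."""
    return sum(image[i] for i in seq)

def significance(seq, samples):
    """Gap between the two letters' mean outputs (near 0 means a 'flat' sequence)."""
    means = {}
    for letter in TEMPLATES:
        outs = [apply_seq(seq, img) for label, img in samples if label == letter]
        means[letter] = sum(outs) / len(outs)
    return abs(means["A"] - means["O"])

def random_seq(n=3):
    """A completely random sequence of n steps."""
    return tuple(rng.randrange(9) for _ in range(n))

def mutate(seq):
    """Build a new sequence 'like' a significant one by changing a single step."""
    i = rng.randrange(len(seq))
    return seq[:i] + (rng.randrange(9),) + seq[i + 1:]

# Labelled training images: we tell the machine which letter each one is.
samples = [(letter, noisy(letter)) for letter in "AO" for _ in range(50)]

pool = [random_seq() for _ in range(10)]
initial_best = max(significance(s, samples) for s in pool)

for _ in range(5):
    ranked = sorted(pool, key=lambda s: significance(s, samples), reverse=True)
    keep = ranked[:3]  # discard the flatter sequences
    pool = keep + [mutate(rng.choice(keep)) for _ in range(7)]  # new ones like the good ones

final_best = max(significance(s, samples) for s in pool)
```

Because the best sequences are carried over from round to round, `final_best` can never be lower than `initial_best`; whether guided mutation actually beats fresh random draws is exactly the hope expressed in the quote, not something this sketch demonstrates.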

I don’t actually know if Selfridge and Dinneen proceeded with that plan, and if so, how well it worked out. It would be nice to look this up sometime.

Few closing remarks

The proceedings of this “session on learning machines” are a fascinating document of the early days of the field.

Reading Selfridge and Dinneen, it is very telling how their approach is rooted in ideas about signal processing, filtering, and “statistics”, rather than (formal) computation, logic, or “rules”. In the same proceedings, there is also a representative of the other approach: a work by Allen Newell on chess playing. This tension is of course a recurring theme ever since, but it’s interesting to see just how far back it goes, and how researchers back then already understood it to be important.

It is very common to attribute the intuition behind CNNs to the H&W model of the primary visual system. The Dinneen and Selfridge papers are a compelling argument that the basic idea might have been going around before that, and nonetheless might have been inspired by some (perhaps more vague) intuition/analogy to biological vision, possibly at the retina level.

More generally, the complicated relationships of AI/ML with neuroscience and psychology are of course another recurring theme, which is not completely independent of the first one I mentioned. In the words of Walter Pitts, at the beginning of his commentary on that session:

The speakers this morning are all imitators in the sense that the poet in Aristotle "imitates" life. But, whereas Messrs. Farley, Clark, Selfridge, and Dinneen are imitating the nervous system, Mr. Newell prefers to imitate the hierarchy of final causes traditionally called the mind. It will come to the same thing in the end, no doubt

“It will come to the same thing in the end” — I guess we are still waiting on that.

Interestingly these are actual photographs taken of “one of the MTC scopes”. Apparently this prehistoric computer back in ‘55 (the Memory Test Computer, or MTC, in Lincoln Lab) had some kind of a graphical output interface that they could plot pictures on.

The idea that lateral inhibition can implement something similar to a derivative-filter / edge detector goes in fact all the way back to Ernst Mach, in the 19th century — way before any physiological evidence could have been presented to support it. It was Hartline (and later Barlow as well) who provided such physiological evidence.
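As a minimal illustration of that idea (my own toy example, not a model taken from Mach, Hartline, or the proceedings), let each unit be excited by its own input and inhibited by the average of its two neighbors. The inhibition weight of 0.5 and the step-shaped input are assumptions chosen to keep the arithmetic exact.

```python
def lateral_inhibition(signal):
    """Each interior unit outputs its own input minus the mean of its two neighbors."""
    out = []
    for i in range(1, len(signal) - 1):
        out.append(signal[i] - 0.5 * (signal[i - 1] + signal[i + 1]))
    return out

# A step edge: dark on the left, bright on the right.
step = [0, 0, 0, 0, 1, 1, 1, 1]
response = lateral_inhibition(step)
print(response)  # [0.0, 0.0, -0.5, 0.5, 0.0, 0.0]
```

The response is zero wherever the input is uniform and non-zero only on the two units flanking the edge. The filter is proportional to a negated discrete second difference, which is one sense in which lateral inhibition behaves like a derivative filter.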


16 Jun 06:10

Why Pahlavi Is So Awful.

by languagehat

We’ve discussed the notorious Pahlavi script before (e.g., last December), but I’m not sure people realized quite how bad a script it is. Now Ben Joeng has a Twitter thread/rant explaining why:

I often describe Pahlavi as the worst writing system ever invented.

Let’s take a perfectly-serviceable writing system for a completely different language (Aramaic) and adapt it for our language (Middle Persian). We’ll call this new script “Pahlavi”.

Aramaic doesn’t really write all the vowels, only consonants, but that’s OK for Aramaic, because through a quirk of Semitic grammar, consonants carry most of the semantics. But our language, Middle Persian, is Indo-European and *does* carry a lot of semantic weight in vowels.

This is why when Greek (IE) borrowed writing from Phoenician (Sem.), it repurposed a bunch of the consonants that Greek didn’t have to use as vowels.

Pahlavi didn’t do that, though, it just carried on not writing vowels.

But what about consonants that Middle Persian has that Aramaic doesn’t? For Greek they made up some fancy new letters like ‹Φ› and ‹Χ›. Or for Dutch they used digraphs like ‹ch›. Pahlavi didn’t do that, no. Aramaic has no /f/, so Pahlavi writes /f/ identically to /p/.

There are a lot of Aramaic consonants that aren’t present in Middle Persian. This means, yes, about half of the 22 Aramaic letters are unused by Pahlavi (sort of, more on that later), but the letters that *are* used are *still* ambiguous.

But let’s make things worse. Let’s start writing some of the letters so similarly that you can’t tell the difference between them. In Book Pahlavi (the most common form), they shaved Aramaic’s 22 consonants down to 13. This means, e.g., /g/, /d/, and /y/ are all written with the same letter, despite them being different in Aramaic.

But let’s make things worse. Book Pahlavi is a cursive script. When two letters come together, they can look identical to a completely different letter. Think about how cursive “iu” might, without the dot, look like “ui”. Book Pahlavi has no such dots. Good luck. All this means that the Middle Persian word for God, “Ohrmazd”, could be just as easily read (and occasionally mispronounced as!) “Anhoma”.

But let’s make things worse. Like all languages, the sounds of Middle Persian changed over time. But Pahlavi didn’t change, which means you write words like “šab” (“night”) as “špa”, because it *used* to be pronounced with a /p/.

But let’s make things *far* worse. So I lied up there. You don’t write “šab” as “špa”. No, you write it “LYLYA”. What. Well, that’s because “lēləyā” is the Aramaic word for “night”, and much like how Japanese borrows Chinese characters to write Japanese words, Pahlavi borrows whole Aramaic *words* to write Middle Persian words. How do you know to pronounce it “šab” instead of “līlīa” (or even “rīlīa”, or “ragulda” or… ugh) or something? You don’t, you just have to know that.

How do we know this isn’t just a borrowing from Aramaic? Because we have Middle Persian dictionaries that say “remember that when you see ‘lylya’, you have to read it as ‘šab’.” Pahlavi is *full* of these “aramaeograms”, even the word for “az” (“from”) is written “MN”.

But the real stinker is, after the Persian literati realized this mess would cause people to horribly mispronounce the sacred texts, they invented a new script based on Pahlavi we call “Avestan” that wrote *all* of the sounds, consonants & vowels, to extreme phonetic detail.

Seriously, it goes so far that we don’t really know why certain letters are distinguished, but they must have been pronounced subtly differently in the past. Why wouldn’t they use this clearly superior system to write Middle Persian instead of friggin’ Pahlavi?

Avestan was invented primarily to write their sacred language (also called Avestan), not the everyday language, so maybe you’re thinking it’s too sacred for plebeian Middle Persian. Nope! There’s a tradition of writing Middle Persian using Avestan called “Pazend”, but it was primarily only used to write commentaries on the sacred texts, and… remember how I said there were dictionaries that told you how to pronounce the aramaeograms? The pronunciations? They were written in Pazend!

WHY DIDN’T THEY JUST SWITCH TO USING PAZEND ALL THE TIME‽‽

In the past, I’d thought that the introduction of the Arabic script to write Persian was an imposition that occurred due to the introduction of Islam, but it wasn’t just that, it was, despite still not writing those short vowels, an actual step up!

Pahlavi does have two things going for it though.

One, it is really pretty.

Two, whenever you write “Ahriman” (the Zoroastrian personification of evil), you always write it upside-down

That’s pretty awesome.

let’s just say there’s a reason why despite Book Pahlavi being an incredibly important historical script, it *still* isn’t encoded in Unicode. because we’re still not totally sure how to do it! despite many of our experts being Persians!

Thanks, Y!

16 Jun 05:28

All the Rais's Crimes

by Shlomi Ben-Meir

In Yedioth Ahronoth this week, Smadar Perry wrote about the possibility that Palestinian Authority chairman Mahmoud Abbas, also known in these parts by the familiar nickname "Abu Mazen", will soon vacate his post one way or another. After all, the Rais is already almost 90. Perry closes her article with a prophetic statement: when Abbas leaves, we will all miss him:

One day it will happen, and Abu Mazen will leave, or sink into twilight. And after he goes, we will remember him with longing as the Palestinian leader who did compensate families of terrorists with money, but never, not once, encouraged them to go out on terror attacks and murder Israelis.

Note Smadar Perry's "never, not once". Her assertion is so categorical that a single example in which Abbas encouraged attacks is enough to refute it completely. Unfortunately, there is far more than one such example. Finding them takes only a simple search on Google, and in the Perspectiva archive.

Let us start with the salaries paid to Palestinian terrorists and their families. Perry knows that this issue is the most painful one, and the one that most clearly contradicts the image of Rais Abbas as someone who does not take the path of terror. So she does not ignore them. But she claims that this is "compensation", and that is her big mistake.

Even if the PA does not directly dispatch terrorists to harm Israelis, it provides terror with a decisive economic incentive, in the form of the salaries it pays to terrorists imprisoned in Israel and to the families of dead terrorists, after the fact and for life.

It should be noted that paying a lifelong salary for murdering Jews is so extreme an act of encouraging terror that, apart from Mahmoud Abbas, it has no parallel anywhere in the world.

Palestinian murderers can expect to receive a salary that grows over the years to more than ten thousand shekels a month. This amounts to over a billion shekels a year from the Authority's budget, several times the budget of Hamas's military wing, for example, going to encourage and perpetuate terror.

If potential terrorists know that murdering Jews will earn them and their families sums that will eventually add up to millions of shekels, what is that if not clear and ongoing encouragement of terror?

Let us continue. Over the years, many cases have accumulated in which Abbas encouraged and incited terror. Here is a very partial, but certainly representative, selection:

* In 2011 Abbas held a publicized meeting in Turkey with the terrorist Amna Muna, who a decade earlier had lured the teenager Ofir Rahum and brought about his murder.

* In 2013 Abbas awarded the "Star of Honor" decoration to Nayef Hawatmeh, who headed the Democratic Front for the Liberation of Palestine, "in appreciation of his important national role in the service of the Palestinian cause and the Palestinian people". That service included countless terror attacks and the murder of many Jews, among them the massacre at the school in Ma'alot.

* In 2014 he sent a condolence letter to the family of the terrorist who shot and wounded Temple Mount activist Yehuda Glick.

According to what is circulating on Facebook, this is the text of the condolence telegram Abbas sent to the family of Yehuda Glick's would-be assassin, but I have not yet received verification of the text itself pic.twitter.com/PvYG6fb9nB
- Gal Berger גל ברגר (@galberger) November 2, 2014

* In an infamous speech in September 2015, Abbas said, in the context of Jews visiting the Temple Mount:

The Al-Aqsa Mosque is ours… and they have no right to defile it with their filthy feet. We will not allow them to do so, and we will do everything in our power to protect Jerusalem…

We bless every drop of blood that has been spilled for Jerusalem; it is clean and pure blood, blood spilled for Allah, Allah willing. Every shahid will reach paradise, and everyone who is wounded will be rewarded by Allah.

Shortly afterwards, Abbas's incitement bore fruit and the "knife intifada" began, in which many Palestinians murdered dozens of Israelis and also became shahids along the way.

* In 2016 Abbas met with the families of the terrorists who attacked and murdered bus passengers in Armon Hanatziv with a knife and a pistol, and of another terrorist named Thaer Abu Ghazaleh, who stabbed five Israelis with a screwdriver.

* In July 2017, after the killing of two Israeli policemen on the Temple Mount and the riots that broke out after Israel tried to install metal detectors at the site, Abbas praised the murders:

We supported you in what you did and in what you are doing… because you make us feel proud, and you care for the Al-Aqsa Mosque, for your land, for your honor, for your people, for your religion and for your holy sites. Jerusalem is ours… therefore, what you are doing is right.

* In a speech at the PLO Central Council meeting in Ramallah in January 2018, alongside an array of antisemitic distortions of Jewish history, Abbas said (minute 15:40 in the video):

Whoever has [a weapon] in his possession, go out and use it. I am saying this openly. If you have a weapon, go ahead. I am with you, and I will help you. Anyone who has a weapon can go ahead.

The discerning reader may ask what Abbas means by "I will help you". Does he mean the generous payment that every terrorist and his family will receive throughout their lives? Possibly.

We have not yet elaborated on the horrific incitement in the Palestinian media controlled by Mahmoud Abbas, which encourages Palestinians to carry out attacks and murder Jews. Nor on the curricula of the Palestinian education system, also under his authority, which glorify terrorists and call on children to go out and murder Jews as part of a jihad in which they will die as shahids.

One thing is clear: Mahmoud Abbas has had more than a hand in encouraging terror and the murder of Jews over many years. Will his successors be even worse than he is? Quite possibly, and that can be debated. But that is not a good enough reason to rewrite history and whitewash the crimes of incitement and terror he has committed throughout his tenure as head of the Palestinian Authority.

The post All the Rais's Crimes appeared first on Perspectiva (פרספקטיבה).

11 Jun 06:17

Saturday Morning Breakfast Cereal - Dance

by Zach Weinersmith


Hovertext:
Yeah? Well it won't seem degrading when all the plumbers get replaced and you've got a head-start.


24 May 08:30

The Boring Slug 2

by Doug

The Boring Slug 2

The adventure continues!

10 May 13:34

Gordrag’s Bane

by Nicholas Gurewitch

The post Gordrag’s Bane appeared first on The Perry Bible Fellowship.

20 Apr 09:18

Cookie & Son

by Doug

Cookie & Son

And more bedtimes.

07 Apr 08:48

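A Matter of Scale

by danpelleg

On 5.4.2023 Hadar Horesh published in TheMarker's Passover supplement an article titled "The road to an apartment is getting longer". The piece is accompanied by a huge chart that depicts apartment prices in recent years, and also wages over the same period. The chart ostensibly supports the article's thesis, which is that housing has become more expensive beyond the rise in wages. Just to be clear, I believe Horesh that the average family in Israel finds it harder and harder to buy an apartment. My main complaint here is about the chart. I am putting it here so that you can try to find the problem yourselves:

What do we see? First of all, dramatic illustrations of people climbing and falling, what is known in professional jargon as chartjunk. But that is not what we will deal with today. Beyond that, there are two different Y axes here, one for wages and the other for real estate. That is always suspicious, and we will see below that the suspicions are justified. But first, what the axes describe:

The wage axis: describes the average wage
The real-estate axis: describes the rate of change in the index

So already here there is a problem. "Rate of change" means a derivative, the difference from year to year. This is in contrast to the wage, which is expressed as a plain number. What is the difference? Suppose some index (wages, the price of fuel, the number of pigs in the street, whatever) starts in a given year at 100. Now suppose it rises by 10% every year. What will its value be after two years? The answer: 121, because the 10% increase in the second year already applies to the enlarged value of 110 at the end of the first year. So although the graph of the derivative will look like a horizontal line (10% with no change from year to year), the graph of the index will show a nice rise.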
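The arithmetic can be checked in a few lines. This is only a sketch of the example above; the function name `compound` is mine, and the numbers (start at 100, 10% a year, two years) are the ones from the text.

```python
def compound(start, rate, years):
    """Index values at the end of each year, starting from `start`."""
    values = [start]
    for _ in range(years):
        values.append(values[-1] * (1 + rate))
    return values

index = compound(100.0, 0.10, 2)

# The "rate of change" series: relative difference from one year to the next.
rates = [(after - before) / before for before, after in zip(index, index[1:])]

print([round(v, 2) for v in index])  # the index itself keeps rising
print([round(r, 2) for r in rates])  # its rate of change stays flat
```

Plotted on the same chart, the first list climbs from 100 to 121 while the second stays at 0.10, even though both describe the same underlying data.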

But that is not the main problem. The problem is the different scales of the axes. Note the range of the two numbers (the smallest and largest values possible on the chart):

For wages: 9 to 15 (in thousands of shekels)
For the housing index: minus 5 to 20 (in percent). This is not visible in the photo because of a fold in the page that I could not straighten. Trust me.

And this in itself changes the relationship between the two lines on the chart (the blue and the orange). Whoever chose to allow values up to 15 on the wage axis left a lot of headroom between the line and the top edge of the chart. And then the line looks flatter. Here, for example, is how such a chart would look in an Excel-like program, without any manipulation. The program picks the range 10 to 13, because these are round numbers that include all the values:

These are exactly the same numbers as in TheMarker's original chart. Only the scale has changed, and the difference becomes much less dramatic. To illustrate, I manually changed the range to the values 9 to 15 as in the newspaper, and it looks like this:

Now there is a difference, and the blue line opens a gap over the orange line (or the orange bars). But it is an optical illusion and nothing more. We can play with this a bit more: let's choose a range of minus 4 to 16, exactly like the range of the index. How will it look now?

In such a presentation, it does look as if the blue rises and the orange is stable, but most people would say that the blue is closing a gap on the orange. And again, these are exactly the same numbers shown in all the charts here. The difference between them is the scale, and the heart of the problem is that there are different scales for the right Y axis and the left Y axis.
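The effect can be put into numbers without drawing anything. In the sketch below, the wage values are hypothetical (I did not copy the figures from the newspaper's chart), and `plotted_height` is a name I made up. The three axis ranges are the ones discussed above.

```python
def plotted_height(values, axis_min, axis_max):
    """Fraction of the plot's height between the lowest and highest data point."""
    return (max(values) - min(values)) / (axis_max - axis_min)

# Hypothetical average wages, in thousands of shekels (not the chart's real data).
wages = [10.2, 10.6, 11.1, 11.7, 12.4]

for axis_min, axis_max in [(10, 13), (9, 15), (-4, 16)]:
    share = plotted_height(wages, axis_min, axis_max)
    print(f"axis {axis_min} to {axis_max}: the line spans {share:.0%} of the height")
```

With these made-up wages the same line covers roughly 73%, 37% and 11% of the chart's height. The data never changes, only the axis range, which is why a second Y axis with its own range makes any visual comparison between the two series arbitrary.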

And the lesson? Every time we see a chart with two Y axes, we should raise our level of suspicion before we set out to interpret the data, and this holds even without the addition of dramatic illustrations. Happy holiday.


02 Apr 07:14

Easygoing Dude

by Reza


06 Mar 16:51

Saturday Morning Breakfast Cereal - Pseudoscience

by tech@thehiveworks.com


Hovertext:
Take that, 100 year old method largely no longer in use!


04 Mar 18:12

Saturday Morning Breakfast Cereal - Teeth

by tech@thehiveworks.com


Hovertext:
The weird thing is he was originally just printing teeth for the pure delight before he discovered the opportunity for savings.


08 Jan 06:58

A conversation about "GPTchat" 📻 "The One in Charge of the Internet" on Galatz

by Ido Kenan

Archive of "The One in Charge of the Internet" on all platforms

ChatGPT is an impressive application of chat with artificial intelligence released by OpenAI, which is also responsible for the AI image generator DALL-E. But it does not really understand us, nor even the answers it gives. We talked about this with Dr. Yuval Pinter, a natural language processing researcher at Ben-Gurion University.

One

ChatGPT's virality stems from the abundant content it generates for the nation of the net, about its convincing answers and its funny glitches. The meme du jour around it is to ask the bot for recipes and try to cook them, as the cooking vlogger Ben Shai did:


A post shared by Ben Shai | בן שי (@ben.shai)

Of course I could not resist and interviewed ChatGPT for the segment. Its Hebrew is not as good as its English, because of the smaller text corpora in our small language, but it is still impressive.

I'm talking on my segment on Galatz tonight about ChatGPT with Yuval Pinter. Do you have anything to pass on to the host, Idan Kweller?

ChatGPT: "Hello! Reminding you that I am the remote library of OpenAI and I cannot give a personal discussion with the hosts of the show. I invite you to ask any question or discuss the topic you are interested in. I can give general details on the topic or answer any questions that may be added".

The question is whether ChatGPT understands the questions I ask and the answers it gives, or whether it all works on statistics over huge text corpora

ChatGPT: "ChatGPT is a model for technologies called "language science programming" or "browser science programming" (NLG), which are meant to give answers in English-language technologies by analyzing large bodies of text. This model tries to give answers in the most suitable way possible according to the data prepared for it, but it cannot directly understand the questions or cope with every special case beyond the programs taught to it. It does this by using statistics on its large text corpora to find terms and content similar to what you ask and to "lefayet" [sic] them in order to give an answer.

"That is, the ChatGPT model does not directly understand the questions you ask and the answers it gives, but it uses statistics on its large text corpora to find content similar to what you ask and to give an answer as far as possible".


The segment "The One in Charge of the Internet" airs every Saturday at ~19:30 on the show "Shishabbat", hosted by Idan Kweller on Galatz. Opening theme: Nyan Cat by daniwell, a guitar cover of Nyan Cat by The GAG Quarter, and the song Never Gonna Give You Up by Rick Astley

03 Dec 17:20

Political Correctness.

by languagehat

Dave Wilton has made a Big List post about the phrase politically correct (often abbreviated as PC), giving a useful account of its history in English, starting with its occurrence as “a collocation of words rather than a fixed lexical item” (as in a 1793 US Supreme Court opinion by Justice James Wilson) and proceeding to the current meaning “conforming to a body of liberal or radical opinion, esp. on social matters”; here’s the first example he finds of the latter (from the Christian Science Monitor of 4 September 1919):

Mr. Svarc charged that just as the Magyars would allow no priest to serve in Slovakia unless he were “politically correct,” in being which he had to become a traitor to those of his own blood and a slave to the Magyars, so the Magyar Government had issued orders to the bishops to cooperate with Austro-Hungarian consuls in this country to get “right conditions” in the United States.

He says “there is no doubt that politically correct had become a term of art in Marxist circles by the middle of the 1920s […] By the mid 1930s, politically correct was appearing in non-Marxist writing, but in reference to restrictions on speech in the Soviet Union, and it is here that the term starts to acquire its negative valence. […] In the 1970s, progressive—not necessarily Marxist—movements in the United States picked up the term. In the process, the term softens from hardline Marxist dogma to a call for inclusion and being mindful and respectful of voices and views that had traditionally been suppressed or ignored.” Which is well and good, but as I said in the comment thread:

I’m surprised you’re treating this as a purely English-language phrase, with no consideration that it might be a calque. Your 1919 quote implies that there was an equivalent Hungarian (or perhaps German) phrase that was being so translated, and a Russian corpus search found this from a 1930 collection by Yuri Pisarenko: “Политическая корректность режиссера не делает театр современным” [The political correctness of the director does not make the theater contemporary] — at that time it is highly unlikely it would be a calque from English. It is certainly an interesting phrase, and it deserves an international investigation, however that might be provided (you’d need scholars familiar with the major European languages and their histories).

And then it occurred to me that I could make a start on that investigation by posting it at LH and seeing what the collective knowledge of the Hattery turns up. (While as you know I don’t try to keep threads “on topic,” I do hope we can avoid the tiresome issue of whether PC is a Good or Bad Thing and focus on the history of the phrase.)

02 Dec 06:49

Saturday Morning Breakfast Cereal - Impure

by tech@thehiveworks.com

Yuval Pinter
Hovertext.


Hovertext:
Shit-Quark is part of the Standard Model of Particle Disses


22 Oct 18:41

Talking to the Saturnians.

by languagehat

Nick Richardson’s LRB review (18 June 2020; archived) of Extraterrestrial Languages, by Daniel Oberhaus, is mostly about recent attempts to communicate with extraterrestrials, which we discussed a couple of years ago, but it begins with a few paragraphs about earlier ideas, which I found charming enough to post:

The hero of The Man in the Moone, a novel written in the late 1620s by the Anglican bishop Francis Godwin, is carried to the moon in a sky chariot pulled by a flock of wild swans. He spends the next few months among the peaceful ‘Lunars’ and gains a measure of fluency in their language, which ‘consisteth not so much of words and letters’ as of melodies ‘that no letters can expresse’. Godwin’s cosmonaut, Gonsales, in many ways had an easy time of it. He could point at a swan or a star and the Lunars would whistle one tune or another. Tune by tune Gonsales pieced together his Lunar vocabulary. But almost the only thing we know for certain about aliens is that they don’t live close enough to see us pointing. We know of a handful of possibly habitable planets, but none is less than four light years away – or 24 trillion miles. And the Lunars aren’t that unlike humans: they’re tall but anthropomorphoid, and even claim to be Christian. More recent sci-fi – such as Ted Chiang’s ‘Story of Your Life’, the inspiration for the film Arrival, in which humans try to communicate with heptapods who perceive all time simultaneously – features aliens that are much more alien. The more we learn about ourselves and the universe, the more we appreciate that aliens probably won’t just be humans with longer limbs and waving antennae. How do you communicate with a planet-sized slime with ESP that eats electricity?

The 19th-century approach to breaking the cosmic ice was to attract attention with a huge (preferably exploding) drawing. The German mathematician Carl Friedrich Gauss wanted to plant a visual proof of Pythagoras’ theorem, comprising a right-angled triangle bordered on each side by squares, in the Siberian tundra. The borders of the shapes were to be marked out by trees and their interiors filled with wheat: this would demonstrate to anyone able to view the diagram from space that humans had mastered both mathematics and agriculture. In Austria, Joseph von Littrow proposed digging trenches in the Sahara, filling them with kerosene and setting them ablaze. Charles Cros, a poet and inventor, petitioned the French government to fund the construction of a huge mirror capable of burning messages onto the Martian and Venusian deserts, while the will of Anne Goguet, a French socialite, left 100,000 francs to the Académie des sciences to be awarded to the first person to communicate successfully with aliens, with the proviso that they couldn’t be Martians, whose existence was already ‘sufficiently well known’. Tristan Bernard satirised the alien-seekers in a story in which humanity, on receiving an unintelligible message from Mars, writes huge messages across the Sahara: ‘I beg your pardon?’ ‘Nothing.’ ‘What are you making signs for then?’ ‘We’re not talking to you, we’re talking to the Saturnians.’

In 1896, the Victorian polymath Francis Galton published a short story in which he describes a message received from Mars – conveyed in a Morse code-like sequence of long and short pulses of light – that begins by illustrating basic mathematical principles, using them as the foundation for progressively more complicated ideas. This encapsulated the scientific community’s best idea of what a message from or to space should look like. Mathematics is the same throughout the universe (they assumed), so using mathematics as the foundation for the message, rather than flaming trenches, seemed a good way of making it universally intelligible. When Guglielmo Marconi started experimenting with radio in the 1890s, transmitting messages like Galton’s to outer space began to look like a realistic possibility. ‘That it is possible to transmit signals to Mars,’ Marconi said, ‘I know as surely as if I had a gun big enough or powder strong enough to shoot there,’ and he endorsed the mathematical style of message outlined in Galton’s story: ‘By sticking to mathematics over a number of years one might come to speech.’ The challenge of communicating with aliens by radio was taken up enthusiastically by Nikola Tesla, who claimed to have intercepted a signal from ‘another world, unknown and remote’. It began with counting: ‘One … two … three …’

I love “with the proviso that they couldn’t be Martians, whose existence was already ‘sufficiently well known’.” (We discussed the movie Arrival in 2016.)


31 Aug 15:46

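'Haaretz': The State Attorney's Office Is Lying, the Terrorists Are Telling the Truth

by Hanan Amiur

The decision of the State Attorney's Office to close the case against the Israeli who stabbed a Palestinian to death near Ariel in June greatly angered B. Michael.

In an article he published yesterday in 'Haaretz', Michael asserted that the decision is an injustice. To give force to his claim of injustice, he chose to tell the story as follows:

Here is the story: on 21.6.22 a settler plunged a knife into the heart of Ali Hassan Harb and killed him. The incident took place on land belonging to the stabbed man's family. The settler and his cronies entered this private land, apparently in preparation for establishing some "outpost". After the stabbing, the stabber took his knife and fled

So, according to 'Haaretz', Ali Hassan Harb, presented as an innocent Palestinian, was stabbed to death by a settler who intruded with his cronies onto the private land of the stabbed man's family, in a one-sided incident of arbitrary Jewish aggression. Both in the trespassing and in the fatal stabbing.

That is indeed one way to tell the story. A second way:

Ali Hassan Harb was not an innocent Palestinian who was stabbed while defending "his family's land" against an invasion. He was a terrorist, the son of a family of terrorists, whom two terror organizations claimed as having operated in their ranks: Hamas and the Popular Front for the Liberation of Palestine. He also received a military funeral of the kind reserved for terrorists killed while carrying out their mission.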

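There is also a third way to tell the story:

There is a reason the incident ended as it did, and that the State Attorney's Office ultimately accepted the stabber's claim of self-defense: the Arab rioters, themselves known terrorists, showed up to the confrontation armed with clubs and axes, and the police and the prosecution hold evidence (recordings) of the Jewish youths calling for help from the security forces and reporting a real threat to their lives.

And there is even a fourth way:

"Land belonging to the stabbed man's family"? The incident took place on land near the entrance to Ariel, which the Palestinians claim belongs to the family. It is almost needless to say that the Palestinians presented no document substantiating the claim that the land belongs to them. In fact, even the ownership claim itself is inaccurate:

Atadel (עאתדל), Harb's aunt, told 'Haaretz' that this is land with olive trees that the family customarily picks from during the harvest season

To summarize:

This was a violent attack, using stones, clubs and axes, by a group of known terrorists, some of them even convicted, on a group of Jewish teenagers accompanied by security personnel. The incident occurred on land whose ownership is at the very least unclear, near the entrance to the city of Ariel.

During the incident, which was characterized by a real danger to life felt by the Jewish youths and backed by evidence, a 27-year-old Palestinian was stabbed to death, and was identified in retrospect as a terrorist.

In light of the evidence (testimonies and recordings), which points to a high likelihood of self-defense on the part of the Jewish stabber, he was released and the case against him was closed.

And now, here again is the account of the event as the editors of 'Haaretz' saw fit to publish it:

Here is the story: on 21.6.22 a settler plunged a knife into the heart of Ali Hassan Harb and killed him. The incident took place on land belonging to the stabbed man's family. The settler and his cronies entered this private land, apparently in preparation for establishing some "outpost". After the stabbing, the stabber took his knife and fled

'Haaretz', simply put, ignores all the evidence and, contrary to it, fully adopts the position of the terror organization Hamas and rejects the position of the State Attorney's Office. Which raises the question: is this a newspaper or a propaganda leaflet in the service of the enemy?

The post 'Haaretz': The State Attorney's Office Is Lying, the Terrorists Are Telling the Truth appeared first on Perspectiva (פרספקטיבה).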

Yuval Pinter

Shared posts

Making up more words for words

Did you know you can just make up new words if you want to?
