2024-12-02 Emacs news

| emacs, emacs-news

Links from reddit.com/r/emacs, r/orgmode, r/spacemacs, r/planetemacs, Mastodon #emacs, Hacker News, lobste.rs, programming.dev, lemmy.world, lemmy.ml, communick.news, planet.emacslife.com, YouTube, the Emacs NEWS file, Emacs Calendar, and emacs-devel. Thanks to Andrés Ramírez for emacs-devel links. Do you have an Emacs-related link or announcement? Please e-mail me at sacha@sachachua.com. Thank you!

View org source for this post

2024-11-25 Emacs news

| emacs, emacs-news

Links from reddit.com/r/emacs, r/orgmode, r/spacemacs, r/planetemacs, Mastodon #emacs, Hacker News, lobste.rs, programming.dev, lemmy.world, lemmy.ml, communick.news, planet.emacslife.com, YouTube, the Emacs NEWS file, Emacs Calendar, and emacs-devel. Thanks to Andrés Ramírez for emacs-devel links. Do you have an Emacs-related link or announcement? Please e-mail me at sacha@sachachua.com. Thank you!

View org source for this post

Remove filler words at the start and upcase the next word

| audio, speechtotext, emacs

[2024-11-21 Thu]: Fixed the second filler words regexp, and make it work at the start of lines too. Thanks to @arialdo@mastodon.online for the feedback!

Like many people, I tend to use "So", "And", "You know", and "Uh" to bridge between sentences when thinking. WhisperX does a reasonable job of detecting sentences and splitting them up anyway, but it leaves those filler words in at the start of the sentence. I usually like to remove these from transcripts so that they read more smoothly.

Here's a short Emacs Lisp function that removes those filler words when they start a sentence, capitalizing the next word. When called interactively, it prompts while displaying an overlay. When called from Emacs Lisp, it changes without asking for confirmation.

(defvar my-filler-words-regexp "\\(\\. \\|^\\)\\(?:So?\\|And\\|You know\\|Uh\\)\\(?:,\\|\\.\\.\\.\\)? \\(.\\)")
(defun my-remove-filler-words-at-start ()
  (interactive)
  (save-excursion
    (let ((case-fold-search nil))
      (while (re-search-forward my-filler-words-regexp nil t)
        (if (and (called-interactively-p) (not current-prefix-arg))
            (let ((overlay (make-overlay (match-beginning 0)
                                         (match-end 0))))
              (overlay-put overlay 'common-edit t)
              (overlay-put
               overlay 'display
               (propertize (concat (match-string 0) " -> "
                                   (match-string 1)
                                   (upcase (match-string 2)))
                           'face 'modus-themes-mark-sel))
              (unwind-protect
                  (pcase (save-match-data (read-char-choice "Replace (y/n/!/q)? " "yn!q"))
                    (?!
                     (replace-match (concat (match-string 1) (upcase (match-string 2))) t)
                     (while (re-search-forward my-filler-words-regexp nil t)
                       (replace-match (concat (match-string 1) (upcase (match-string 2))) t)))
                    (?y
                     (replace-match (concat (match-string 1) (upcase (match-string 2))) t))
                    (?n nil)
                    (?q (goto-char (point-max))))
                (delete-overlay overlay)))
          (replace-match (concat (match-string 1) (upcase (match-string 2))) t))))))
This is part of my Emacs configuration.
View org source for this post

Wednesday weblog: week ending November 20, 2024

| weekly, weblog, review
  • Reflection on writing style - 2024-11-18T00:44:18.080Z

    I notice that I have a lot more fun writing tiny workflow tweaks (mostly #Emacs ) and sharing them on my blog versus, say, insightful reflections developed over a longer period of time. I think it's the payoff of being able to enjoy those tweaks. Sometimes abstract thoughts help me come to realizations that I can then try to use to change my concrete behaviours, but it's a lot less straightforward.

    Also, I notice that I prefer to write with a curious, exploratory tone instead of an authoritative one, which is probably also related to my focus on "I" rather than "you". Kinda like: here's what I'm experimenting with, sharing in case it's helpful (and also because I want to be able to find it again), everyone's different and that's awesome, curious about what works for you. :) I'm glad other people can pull off being authoritative/persuasive, though.

    23+ years #blogging and still learning more!

  • Sketchnote blogs - 2024-11-17T18:19:47.826Z

    I'm surprised by how few active blogs I could find about #sketchnotes (or had a category feed for sketchnotes). It's mostly rohdesign and Verbal to Visual, I think. Sketchnote Army still comes out with episodes, but the posts themselves don't seem to be very visual, so people have to click through to the person's website. I guess a lot of people are on Instagram, but that doesn't seem to support RSS any more, and I'm not really keen on scrolling through that. Ah well!

  • dark mode sketch filter - 2024-11-14T13:41:01.165Z

    I tweaked my dark-mode sketch CSS rule thanks to stefanvdwalt's comment. Now I've got

      @media (prefers-color-scheme: dark) {
      .sketch-full img, .gallery img, .left-doodle, .right-doodle,
      .center-doodle { filter: invert(1) hue-rotate(180deg) brightness(150%)
      contrast(0.9); }
      }
    

    Updated: https://sachachua.com/blog/2024/11/using-a-coloured-template-on-my-supernote-a5x/

  • Researched BBB hosting options and compared the costs with self-hosting on Linode.
  • Checked the shell scripts to make sure that hosts can start the videos by using shortcuts.

Quotes

  • Excerpts from Rebecca Solnit's "A Field Guide to Getting Lost" (2006) - 2024-11-19T12:52:11.878Z

    One of the books that has just arrived from the library is "A Field Guide to Getting Lost" (Rebecca Solnit, 2006), which was recommended to me by @janoli .

    Here are some snippets that have resonated with me so far:

    p5. Love, wisdom, grace, inspiration–how do you go about finding these things that are in some ways about extending the boundaries of the self into unknown territory, about becoming someone else?

    p10. and there's another art of being at home in the unknown, so that being in its midst isn't cause for panic or suffering, of being at home with being lost.

    p14. The historian Aaron Sachs, about explorers: "In my opinion, their most important skill was simply a sense of optimism about surviving and finding their way."

    p80. Even in the everyday world of the present, an anxiety to survive manifests itself in cars and clothes for far more rugged occasions than those at hand, as though to express some sense of the toughness of things and of readiness to face them. But the real difficulties, the real arts of survival, seem to lie in more subtle realms. There, what's called for is a kind of resilience of the psyche, a readiness to deal with what comes next.

    p99. Probably it had its origins in protective urges, but it had gone sour long ago.

  • Excerpts from Bill Watterson's speech at Kenyon College in 1990 - 2024-11-19T12:19:15.286Z

    Thanks to @kims for sharing Bill Watterson's speech at Kenyon College, Gambier Ohio, to the 1990 graduating class (https://web.mit.edu/jmorzins/www/C-H-speech.html)

    This section particularly resonated with me: "Creating a life that reflects your values and satisfies your soul is a rare achievement. In a culture that relentlessly promotes avarice and excess as the good life, a person happy doing his own work is usually considered an eccentric, if not a subversive. Ambition is only understood if it's to rise to the top of some imaginary ladder of success. Someone who takes an undemanding job because it affords him the time to pursue other interests and activities is considered a flake. A person who abandons a career in order to stay home and raise children is considered not to be living up to his potential-as if a job title and salary are the sole measure of human worth."

    I also appreciated his resistance to commercializing Calvin & Hobbes:
    "Selling out is usually more a matter of buying in. Sell out, and you're really buying into someone else's system of values, rules and rewards.
    The so-called 'opportunity' I faced would have meant giving up my individual voice for that of a money-grubbing corporation. It would have meant my purpose in writing was to sell things, not say things. My pride in craft would be sacrificed to the efficiency of mass production and the work of assistants. Authorship would become committee decision. Creativity would become work for pay. Art would turn into commerce. In short, money was supposed to supply all the meaning I'd need.
    What the syndicate wanted to do, in other words, was turn my comic strip into everything calculated, empty and robotic that I hated about my old job. They would turn my characters into television hucksters and T-shirt sloganeers and deprive me of characters that actually expressed my own thoughts."

Other links

View org source for this post

Updating my audio braindump workflow to take advantage of WhisperX

| emacs, speechtotext, org

I get word timestamps for free when I transcribe with WhisperX, so I can skip the Aeneas alignment step. That means I can update my previous code for handling audio braindumps . Breaking the transcript up into sections Also, I recently updated subed-word-data to colour words based on their transcription score, which draws my attention to things that might be uncertain.

Here's what it looks like when I have the post, the transcript, and the annotated PDF.

2024-11-17_20-44-30.png
Figure 1: Screenshot of draft, transcript, and PDF

Here's what I needed to implement my-audio-braindump-from-whisperx-json (plus some code from my previous audio braindump workflow):

(defun my-whisperx-word-list (file)
  (let* ((json-object-type 'alist)
         (json-array-type 'list))
    (seq-mapcat (lambda (seg)
                  (alist-get 'words seg))
                (alist-get 'segments (json-read-file file)))))

;; (seq-take (my-whisperx-word-list (my-latest-file "~/sync/recordings" "\\.json")) 10)
(defun my-whisperx-insert-word-list (words)
  "Inserts WORDS with text properties."
  (require 'subed-word-data)
  (mapc (lambda (word)
            (let ((start (point)))
              (insert
               (alist-get 'word word))
              (subed-word-data--add-word-properties start (point) word)
              (insert " ")))
        words))

(defun my-audio-braindump-turn-sections-into-headings ()
  (interactive)
  (goto-char (point-min))
  (while (re-search-forward "START SECTION \\(.+?\\) STOP SECTION" nil t)
    (replace-match
     (save-match-data
       (format
        "\n*** %s\n"
        (save-match-data (string-trim (replace-regexp-in-string "^[,\\.]\\|[,\\.]$" "" (match-string 1))))))
     nil t)
    (let ((prop-match (save-excursion (text-property-search-forward 'subed-word-data-start))))
      (when prop-match
        (org-entry-put (point) "START" (format-seconds "%02h:%02m:%02s" (prop-match-value prop-match)))))))

(defun my-audio-braindump-split-sentences ()
  (interactive)
  (goto-char (point-min))
  (while (re-search-forward "[a-z]\\. " nil t)
    (replace-match (concat (string-trim (match-string 0)) "\n") )))

(defun my-audio-braindump-restructure ()
  (interactive)
  (goto-char (point-min))
  (my-subed-fix-common-errors)
  (org-mode)
  (my-audio-braindump-prepare-alignment-breaks)
  (my-audio-braindump-turn-sections-into-headings)
  (my-audio-braindump-split-sentences)
  (goto-char (point-min))
  (my-remove-filler-words-at-start))

(defun my-audio-braindump-from-whisperx-json (file)
  (interactive (list (read-file-name "JSON: " "~/sync/recordings/" nil nil nil (lambda (f) (string-match "\\.json\\'" f)))))
  ;; put them all into a buffer
  (with-current-buffer (get-buffer-create "*Words*")
    (erase-buffer)
    (fundamental-mode)
    (my-whisperx-insert-word-list (my-whisperx-word-list file))
    (my-audio-braindump-restructure)
    (goto-char (point-min))
    (switch-to-buffer (current-buffer))))

(defun my-audio-braindump-process-text (file)
  (interactive (list (read-file-name "Text: " "~/sync/recordings/" nil nil nil (lambda (f) (string-match "\\.txt\\'" f)))))
  (with-current-buffer (find-file-noselect file)
    (my-audio-braindump-restructure)
    (save-buffer)))
;; (my-audio-braindump-from-whisperx-json (my-latest-file "~/sync/recordings" "\\.json"))

Ideas for next steps:

  • I can change my processing script to split up the Whisper TXT into sections and automatically make the PDF with nice sections.
  • I can add reminders and other callouts. I can style them, and I can copy reminders into a different section for easier processing.
  • I can look into extracting PDF annotations so that I can jump to the next highlight or copy highlighted text.
This is part of my Emacs configuration.
View org source for this post

Looking at my blog post stats by year

| blogging
blog-stats.svg
Figure 1: Blog statistics

I was curious about the shape of my blog over the years, excluding Emacs News and my link-heavy weekly/monthly reviews. It started off with lots of little posts like the way other weblogs were also quick links and notes. As weblogs morphed into blogs with more text, I also settled down into fewer, longer posts with lots of code (analyzed by looking for <pre> blocks). I wrote much less after A+ was born. Interestingly, I've been shifting towards longer posts with more images.

  • Blog posts exclude permalinks that match emacs-news|review|week-ending, which casts a bit of a wide net but should give me the general shape of things.
  • Total words per year and average words per post both exclude code snippets.

Here's how I got those numbers:

(append
 '(("Year" "Posts" "Total words" "Words per post" "Posts with pre" "Posts with images")
   hline)
 (cl-loop for i from 2001 to 2024
          collect
          (let* ((default-directory (expand-file-name (number-to-string i) "~/proj/static-blog/blog"))
                 (exclude (shell-quote-argument "emacs-news|review|week-ending"))
                 (files (format "find . -name '*.html' | grep -v -e '%s' | " exclude))
                 (posts (string-to-number
                         (string-trim
                          (shell-command-to-string (concat files "wc -l")))))
                 (words (string-to-number
                         (replace-regexp-in-string
                          "TOTAL: " ""
                          (shell-command-to-string
                           (concat files "xargs ~/bin/count-words | grep TOTAL")))))
                 (posts-with-images
                  (string-to-number
                   (string-trim
                    (shell-command-to-string (concat files "xargs grep -l '<img' | wc -l")))))
                 (posts-with-pre
                  (string-to-number
                   (string-trim
                    (shell-command-to-string (concat files "xargs grep -l '<pre' | wc -l"))))))
            (list i
                  posts
                  words
                  (/ words posts)
                  posts-with-images
                  posts-with-pre))))
Year Posts Total words Words per post Posts with pre Posts with images
2001 3 438 146 0 0
2002 31 4336 139 0 0
2003 863 64953 75 0 59
2004 967 125789 130 2 98
2005 679 135334 199 4 40
2006 869 171042 196 19 42
2007 489 107011 218 33 32
2008 380 121158 318 85 57
2009 400 175692 439 81 20
2010 335 160289 478 93 19
2011 324 163274 503 93 28
2012 286 124300 434 111 12
2013 273 173021 633 172 11
2014 272 186788 686 138 30
2015 173 133682 772 82 36
2016 25 11560 462 13 6
2017 37 24063 650 6 2
2018 66 46827 709 7 8
2019 18 13054 725 3 6
2020 13 6791 522 4 5
2021 31 17389 560 8 16
2022 21 11264 536 4 9
2023 68 47188 693 26 52
2024 74 58439 789 27 40

And here's how I plotted the charts:

import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame(data[1:], columns=data[0])
# Create a figure with subplots
fig, (ax1, ax4, ax2, ax3) = plt.subplots(4, 1, figsize=(10, 12))
fig.suptitle('Blog Statistics by Year', fontsize=16)

# Plot Posts
ax1.bar(df['Year'], df['Posts'], color='lightblue', label='Other posts')
ax1.bar(df['Year'], df['Posts with pre'] , color='darkblue', label='With preformatted blocks')
ax1.set_title('Number of posts per year')
ax1.set_ylabel('Posts')
ax1.legend()

# Plot Posts
ax4.bar(df['Year'], df['Posts'], color='lightblue', label='Other posts')
ax4.bar(df['Year'], df['Posts with images'] , color='darkgreen', label='With images')
ax4.set_title('Number of posts per year')
ax4.set_ylabel('Posts')
ax4.legend()

# Plot Total Words
ax2.bar(df['Year'], df['Total words'], color='lightblue')
ax2.set_title('Total words per year')
ax2.set_ylabel('Total words')

# Plot Words per Post
ax3.bar(df['Year'], df['Words per post'], color='lightblue')
ax3.set_title('Average words per post')
ax3.set_ylabel('Words per post')
ax3.set_xlabel('Year')

# Adjust layout and display
plt.savefig(f)
View org source for this post

2024-11-18 Emacs news

| emacs, emacs-news

Links from reddit.com/r/emacs, r/orgmode, r/spacemacs, r/planetemacs, Mastodon #emacs, Hacker News, lobste.rs, programming.dev, lemmy.world, lemmy.ml, communick.news, planet.emacslife.com, YouTube, the Emacs NEWS file, Emacs Calendar, and emacs-devel. Thanks to Andrés Ramírez for emacs-devel links. Do you have an Emacs-related link or announcement? Please e-mail me at sacha@sachachua.com. Thank you!

View org source for this post