Monday, January 22, 2024

Updated Fourier Transform

I've updated my FFT code to present spectral data (whether it's accelerometer data, sound data, CA data, or whatever) in the correct way.  There was nothing necessarily *wrong* with how I've been doing it, but now I can take full advantage of the temporal aspects of the data.  With the old code, I was limited to a temporal resolution proportional to the size of the FFT window.  Now I have full control of the time resolution, all the way down to one spectrum per input sample.  With an improved time resolution, the resulting spectrograms can (and usually will) look much better than before.

The downside to this is that the output data has much larger dimensions.  For example, with audio data that comes in at 44.1 kHz (normal audio recording), one second of data run through a 1024-point FFT window results in a spectrogram that's 512 x 44,100 pixels in size!  If I have a recording that's five minutes long, that's a spectrogram that's 512 x 13,230,000 pixels in size!  I can easily deal with this by not using every data point, but rather every X data points, where X can be any interval I want.  This effectively widens the time interval between individual spectra, but in actuality there are very few instances where I'm going to need a sampling rate any higher than maybe 5 kHz.  So I can use an 'X' value of 100 and still get a spectrum about every 0.0023 seconds.
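
A minimal sketch of the idea, not necessarily how the actual code works: a sliding-window spectrogram with an adjustable step between spectra.  The function name sliding_spectrogram, the Hann window, and the zero padding at the edges are assumptions made for illustration; the hop argument plays the role of X.

```python
import numpy as np

def sliding_spectrogram(signal, window_size=1024, hop=1):
    """Return a magnitude spectrogram with one spectrum every `hop` samples.

    Output shape is (window_size // 2, number_of_spectra).
    """
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(window_size)  # assumed taper; the real code may differ
    # Zero-pad both ends so each selected sample sits at the center of a window.
    padded = np.pad(signal, window_size // 2)
    starts = range(0, len(signal), hop)
    frames = np.stack([padded[s:s + window_size] for s in starts])
    spectra = np.fft.rfft(frames * window, axis=1)
    # rfft returns window_size // 2 + 1 bins; drop the Nyquist bin to keep 512 rows.
    return np.abs(spectra[:, : window_size // 2]).T

fs = 44_100                           # audio sample rate in Hz
t = np.arange(fs) / fs                # one second of samples
tone = np.sin(2 * np.pi * 1000 * t)   # 1 kHz test tone

spec = sliding_spectrogram(tone, window_size=1024, hop=100)
print(spec.shape)                     # (512, 441)
print(100 / fs)                       # seconds between spectra
```

With hop=1 the same second of audio would give 512 x 44,100 values; with hop=100 a five-minute recording shrinks from 13,230,000 columns to 132,300.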

Anyhow, this is more of an internal improvement to the code and not very visible to anyone but me.

Here's 5 seconds of the complex call of a Curve-Billed Thrasher (I'm fortunate to have such a creative songbird literally right outside my window year-round).  Time goes left to right (0 to 5 seconds), and frequency goes bottom to top (0 to 22.05 kHz):



Listening to the sounds and looking at the spectrograms has got me very interested in trying to make sense of these sounds.  It's automatic for me: start running the scientific method on this stuff and see where it leads me.  LOL, like so many other things, probably nowhere!

Monday, January 1, 2024

Sandhill Cranes Generated by DALL-E3

Pretty impressive...

Prompt: generate a 1794 by 1024 realistic image of a sandhill crane in flight over a playa in southeastern arizona usa





and then a further prompt: generate two more images,  but with about five sandhill cranes



Prompt: generate 1794 by 1024 images of a sandhill crane in flight over a playa in southeastern arizona usa. draw it in a way a 5 year old child would with crayon



Prompt: please make them more simplistic and drawn by a child under the age of five



prompt: please generate two more images in the same style, but draw a small flock of sandhill cranes



 I really dig the shadows on that last one.  The creation of a 5-year-old child?  Sure!



Monday, December 18, 2023

Bees and Birds

A bee was flying around my camera while I was recording the flights of the sandhill cranes, so I got a pretty decent audio recording of it.  I can confirm that a bee beats its wings at about 250 Hz, but wow, look at those harmonics!
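
A rough way to check a wingbeat frequency like this is to take the FFT of a short clip and look for the strongest peak and its integer multiples.  The sketch below uses a synthetic buzz (a 250 Hz fundamental plus weaker harmonics, with made-up amplitudes) rather than the actual recording, and the helper name dominant_frequency is hypothetical.

```python
import numpy as np

def dominant_frequency(signal, fs):
    """Return the frequency (Hz) of the largest peak in the magnitude spectrum."""
    spectrum = np.abs(np.fft.rfft(signal * np.hanning(len(signal))))
    freqs = np.fft.rfftfreq(len(signal), d=1 / fs)
    spectrum[0] = 0.0  # ignore any DC offset
    return freqs[np.argmax(spectrum)]

fs = 44_100
t = np.arange(fs) / fs  # one second of samples
# Synthetic stand-in for the bee: fundamental at 250 Hz plus harmonics 2..6,
# with amplitudes falling off as 1/k (an assumption, not measured data).
buzz = sum(np.sin(2 * np.pi * 250 * k * t) / k for k in range(1, 7))

f0 = dominant_frequency(buzz, fs)
harmonics = [k * f0 for k in range(1, 7)]  # where the harmonic bands should sit
print(f0)
print(harmonics)
```

On a real recording the harmonics show up in the spectrogram as evenly spaced horizontal bands at these multiples of the fundamental.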



Birds...

















Wednesday, December 13, 2023

Murmurations Dawn 13 December 2023







Counting Objects continued

From my previous chat session, I asked chatGPT to generate five prompts that would provide it with adequate instructions to do all of this in ONE step.  Here's what it generated:

"Please apply a 10x10 pixel median filter to an uploaded grayscale image, then perform double thresholding with a minimum threshold of 0.1 and a maximum threshold of 0.5. Count the number of objects, label each object with a unique, incrementing number directly on the image, and display the histogram of pixel intensities with a logarithmic y-axis scale. Show the median-filtered image, the thresholded and labeled image, and the histogram in your response."

"Load an uploaded image, convert it to grayscale if it isn't already, and apply a median filter using a 10x10 pixel window. After filtering, use double thresholding with thresholds set at 0.1 and 0.5. Count the distinct objects in the thresholded image and label them sequentially on the image itself. Additionally, generate and display a histogram of the median-filtered image with a logarithmic scale on the y-axis. Provide the filtered image, the labeled thresholded image, and the histogram in your final output."

"I need you to process an uploaded image as follows: first, apply a median filter with a 10x10 kernel, then conduct a double thresholding operation with thresholds of 0.1 and 0.5. Count and label each detected object in the image, using incremental numbers for labeling. Also, produce a histogram of the filtered image's pixel intensities on a logarithmic y-axis. Present the original median-filtered image, the labeled image after thresholding, and the histogram in your response."

"Start by applying a 10x10 median filter to an uploaded grayscale image. Then perform thresholding with a minimum limit of 0.1 and a maximum of 0.5. Identify and label each object in the image with increasing numbers. Also, create a histogram showing the distribution of pixel intensities from the median-filtered image on a log scale. Display the processed median-filtered image, the final labeled image after thresholding, and the histogram."

"For an uploaded grayscale image, first implement a median filtering with a 10x10 window. Next, apply a double thresholding technique with thresholds set at 0.1 (minimum) and 0.5 (maximum). Count the objects in the thresholded image, labeling each with a sequential number. Generate a histogram of the pixel intensities post-median filtering using a logarithmic scale for the frequency. Please provide the original median-filtered image, the thresholded and labeled image, and the histogram in your results."

I used the first prompt, and here is the result:

uploaded image:

Me: Please apply a 10x10 pixel median filter to an uploaded grayscale image, then perform double thresholding with a minimum threshold of 0.1 and a maximum threshold of 0.5. Count the number of objects, label each object with a unique, incrementing number directly on the image, and display the histogram of pixel intensities with a logarithmic y-axis scale. Show the median-filtered image, the thresholded and labeled image, and the histogram in your response.

chatGPT:

import cv2
import numpy as np
import matplotlib.pyplot as plt

# Load the image
image_path = '/mnt/data/pennies.png'
original_image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)

# Apply a 10x10 median filter
median_filtered = cv2.medianBlur(original_image, 10)

# Perform double thresholding
# Normalizing the pixel intensities to be between 0 and 1 for thresholding
normalized_img = median_filtered / 255.0
thresholded = np.zeros_like(normalized_img)
thresholded[(normalized_img >= 0.1) & (normalized_img <= 0.5)] = 1

# Find and label the objects
num_labels, labels_im = cv2.connectedComponents(thresholded.astype(np.uint8))

# Create an image that will be used to display the labels
label_hue = np.uint8(179*labels_im/np.max(labels_im))
blank_ch = 255*np.ones_like(label_hue)
labeled_img = cv2.merge([label_hue, blank_ch, blank_ch])
labeled_img = cv2.cvtColor(labeled_img, cv2.COLOR_HSV2BGR)
labeled_img[label_hue==0] = 0

# Draw the labels on the image
for i in range(1, num_labels):
    cv2.putText(labeled_img, str(i), tuple(np.mean(np.argwhere(labels_im == i), axis=0).astype(int)),
                cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255,255,255), 1)

# Create a histogram of pixel intensities with a logarithmic y-axis scale
histogram, bin_edges = np.histogram(original_image, bins=256, range=(0, 256))

# Plotting the histogram
plt.figure()
plt.plot(bin_edges[0:-1], histogram, color='black')
plt.yscale('log')
plt.title('Histogram of Pixel Intensities (Log Scale)')
plt.xlabel('Pixel Intensity')
plt.ylabel('Frequency (Log Scale)')

# Save the images and histogram
median_filtered_path = '/mnt/data/median_filtered.png'
cv2.imwrite(median_filtered_path, median_filtered)

labeled_img_path = '/mnt/data/labeled_img.png'
cv2.imwrite(labeled_img_path, labeled_img)

histogram_path = '/mnt/data/histogram.png'
plt.savefig(histogram_path)
plt.close()

# Return the paths to the saved images and histogram
median_filtered_path, labeled_img_path, histogram_path, num_labels - 1

STDOUT/STDERR
Matplotlib is building the font cache; this may take a moment.

It looks like there was an issue applying a median filter with a kernel size of 10, which must be an odd number. I will correct this and use a 9x9 kernel instead, which is the nearest odd kernel size to 10. Let's proceed with the corrected filter size.

import cv2
import numpy as np
import matplotlib.pyplot as plt

# Load the image
image_path = '/mnt/data/pennies.png'
original_image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)

# Apply a 9x9 median filter (cv2.medianBlur requires an odd kernel size)
median_filtered = cv2.medianBlur(original_image, 9)

# Perform double thresholding
# Normalizing the pixel intensities to be between 0 and 1 for thresholding
normalized_img = median_filtered / 255.0
thresholded = np.zeros_like(normalized_img)
thresholded[(normalized_img >= 0.1) & (normalized_img <= 0.5)] = 1

# Find and label the objects
num_labels, labels_im = cv2.connectedComponents(thresholded.astype(np.uint8))

# Create an image that will be used to display the labels
label_hue = np.uint8(179*labels_im/np.max(labels_im))
blank_ch = 255*np.ones_like(label_hue)
labeled_img = cv2.merge([label_hue, blank_ch, blank_ch])
labeled_img = cv2.cvtColor(labeled_img, cv2.COLOR_HSV2BGR)
labeled_img[label_hue==0] = 0

# Draw the labels on the image
for i in range(1, num_labels):
    cv2.putText(labeled_img, str(i), tuple(np.mean(np.argwhere(labels_im == i), axis=0).astype(int)),
                cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255,255,255), 1)

# Create a histogram of pixel intensities with a logarithmic y-axis scale
histogram, bin_edges = np.histogram(original_image, bins=256, range=(0, 256))

# Plotting the histogram
plt.figure()
plt.plot(bin_edges[0:-1], histogram, color='black')
plt.yscale('log')
plt.title('Histogram of Pixel Intensities (Log Scale)')
plt.xlabel('Pixel Intensity')
plt.ylabel('Frequency (Log Scale)')

# Save the images and histogram
median_filtered_path = '/mnt/data/median_filtered.png'
cv2.imwrite(median_filtered_path, median_filtered)

labeled_img_path = '/mnt/data/labeled_img.png'
cv2.imwrite(labeled_img_path, labeled_img)

histogram_path = '/mnt/data/histogram.png'
plt.savefig(histogram_path)
plt.close()

# Return the paths to the saved images and histogram
median_filtered_path, labeled_img_path, histogram_path, num_labels - 1

The median filtering, thresholding, object labeling, and histogram generation have been completed successfully. Here are the results:

  • Median-Filtered Image: This image has been filtered using a 9x9 median filter.
  • Thresholded and Labeled Image: The objects have been identified and labeled with unique, incrementing numbers.
  • Histogram of Pixel Intensities: The histogram displays the frequency of pixel intensities on a logarithmic scale.

There are a total of 13 objects identified in the image.

You can view and download the images and histogram using the following links:

Now I will display the images and histogram inline for your convenience.

Above are the processed images and histogram:



  1. On the left is the Median-Filtered Image, which has been smoothed using a 9x9 median filter.
  2. In the middle is the Thresholded and Labeled Image, where each object detected within the specified thresholds is labeled with a unique number.
  3. On the right is the Histogram of Pixel Intensities with a logarithmic y-axis scale, showing the distribution of pixel values within the original image.

---------------------

Ok, so as you can see, it correctly counts the number of objects.  There are some notable problems, which I'll focus on in my next post.

For now I'm very pleased with the results.  I want to try the four other prompts and different input images with objects to count.