Astronomy & Music
Independent Research Astronomer and Space Musician
Come with me and re-discover the universe!
Sunday, March 3, 2024
Thursday, February 29, 2024
Monday, January 22, 2024
Updated Fourier Transform
I've updated my FFT code to present spectral data (from accelerometer data, sound data, CA data, or whatever) in the correct way. There was nothing necessarily *wrong* with how I'd been doing it, but now I can take full advantage of the temporal aspects of the data. With the old code, I was limited to a temporal resolution proportional to the size of the FFT window. Now I have full control of the time resolution, down to one spectrum per input sample. With an improved time resolution, the resulting spectrograms can (and usually will) look much better than before.
The downside is that the output data has much larger dimensions. For example, with audio sampled at 44.1 kHz (normal audio recording), one second of data run through a 1024-sample FFT window yields a spectrogram that's 512 x 44,100 pixels in size! For a five-minute recording, that's a spectrogram that's 512 x 13,230,000 pixels in size! I can easily deal with this by not using every data point, but rather every X data points, where X can be any interval I want. This widens the time interval between individual spectra, but in practice there are very few instances where I'm going to need a spectrum rate higher than maybe 5 kHz. Even with an X value of 100, I'm still computing a spectrum about every 0.0023 seconds (441 spectra per second).
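The every-X-samples idea can be sketched roughly as follows. This is not the code behind these spectrograms, just a minimal illustration assuming NumPy; the function name `spectrogram`, the Hann window, and the 1 kHz test tone are all assumptions made for the example.

```python
import numpy as np

def spectrogram(samples, window_size=1024, hop=100):
    """Magnitude spectrogram with one spectrum every `hop` samples (the 'X' value)."""
    samples = np.asarray(samples, dtype=float)
    window = np.hanning(window_size)  # assumed taper; any window could be used
    # Start positions of each FFT window, spaced `hop` samples apart.
    starts = range(0, len(samples) - window_size + 1, hop)
    frames = np.stack([samples[s:s + window_size] * window for s in starts])
    # rfft gives window_size // 2 + 1 bins; drop the last (Nyquist) bin
    # so a 1024-sample window yields 512 frequency rows.
    spectra = np.abs(np.fft.rfft(frames, axis=1))[:, : window_size // 2]
    return spectra.T  # rows = frequency bins, columns = time steps

rate = 44_100                          # samples per second
t = np.arange(rate) / rate             # one second of audio
tone = np.sin(2 * np.pi * 1000.0 * t)  # synthetic 1 kHz test tone

spec = spectrogram(tone, window_size=1024, hop=100)
print(spec.shape)                      # far fewer columns than with hop=1
print(100 / rate)                      # seconds between spectra for X = 100
```

Setting `hop=1` reproduces the full-resolution case (one column per sample), while larger values trade time resolution for a smaller image.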
Anyhow, this is more of an internal improvement to the code and not very visible to anyone but me.
Here's 5 seconds of the complex call of a Curve-Billed Thrasher (I'm fortunate to have such a creative songbird right outside my window year-round). Time goes left to right (0 to 5 seconds), and frequency goes bottom to top (0 to 22.05 kHz):
Monday, January 1, 2024
Sandhill Cranes Generated by DALL-E3
Pretty impressive...
Prompt: generate a 1794 by 1024 realistic image of a sandhill crane in flight over a playa in southeastern arizona usa
and then a further prompt: generate two more images, but with about five sandhill cranes
Prompt: generate 1794 by 1024 images of a sandhill crane in flight over a playa in southeastern arizona usa. draw it in a way a 5 year old child would with crayon
Monday, December 18, 2023
Bees and Birds
A bee was flying around my camera while I was recording the flights of the sandhill cranes, so I got a pretty decent audio recording of it. I can confirm that a bee beats its wings at about 250 Hz, but wow, look at those harmonics!
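For anyone wondering what harmonics look like in the numbers, here is a small hedged sketch using a synthetic signal, not the actual bee recording. It assumes NumPy, a 44.1 kHz sample rate, and a made-up buzz built from a 250 Hz fundamental plus three weaker overtones.

```python
import numpy as np

rate = 44_100                # samples per second (assumed)
t = np.arange(rate) / rate   # one second of time stamps
fundamental = 250.0          # approximate wingbeat frequency

# Synthetic stand-in for a buzz: the fundamental plus three weaker harmonics.
buzz = sum(np.sin(2 * np.pi * fundamental * k * t) / k for k in range(1, 5))

spectrum = np.abs(np.fft.rfft(buzz))
freqs = np.fft.rfftfreq(len(buzz), d=1.0 / rate)

# Frequencies of the four strongest bins, sorted from low to high.
peaks = np.sort(freqs[np.argsort(spectrum)[-4:]])
print(peaks)
```

The peaks land at integer multiples of the fundamental, which is the ladder-like pattern that shows up in a spectrogram of a real buzz.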
Wednesday, December 13, 2023
Counting Objects continued
From my previous chat session, I asked chatGPT to generate five prompts that would provide it with adequate instructions to do all of this in ONE step. Here's what it generated:
"Please apply a 10x10 pixel median filter to an uploaded grayscale image, then perform double thresholding with a minimum threshold of 0.1 and a maximum threshold of 0.5. Count the number of objects, label each object with a unique, incrementing number directly on the image, and display the histogram of pixel intensities with a logarithmic y-axis scale. Show the median-filtered image, the thresholded and labeled image, and the histogram in your response."
"Load an uploaded image, convert it to grayscale if it isn't already, and apply a median filter using a 10x10 pixel window. After filtering, use double thresholding with thresholds set at 0.1 and 0.5. Count the distinct objects in the thresholded image and label them sequentially on the image itself. Additionally, generate and display a histogram of the median-filtered image with a logarithmic scale on the y-axis. Provide the filtered image, the labeled thresholded image, and the histogram in your final output."
"I need you to process an uploaded image as follows: first, apply a median filter with a 10x10 kernel, then conduct a double thresholding operation with thresholds of 0.1 and 0.5. Count and label each detected object in the image, using incremental numbers for labeling. Also, produce a histogram of the filtered image's pixel intensities on a logarithmic y-axis. Present the original median-filtered image, the labeled image after thresholding, and the histogram in your response."
"Start by applying a 10x10 median filter to an uploaded grayscale image. Then perform thresholding with a minimum limit of 0.1 and a maximum of 0.5. Identify and label each object in the image with increasing numbers. Also, create a histogram showing the distribution of pixel intensities from the median-filtered image on a log scale. Display the processed median-filtered image, the final labeled image after thresholding, and the histogram."
"For an uploaded grayscale image, first implement a median filtering with a 10x10 window. Next, apply a double thresholding technique with thresholds set at 0.1 (minimum) and 0.5 (maximum). Count the objects in the thresholded image, labeling each with a sequential number. Generate a histogram of the pixel intensities post-median filtering using a logarithmic scale for the frequency. Please provide the original median-filtered image, the thresholded and labeled image, and the histogram in your results."
I used the first prompt, and here is the result:
uploaded image:
Me: Please apply a 10x10 pixel median filter to an uploaded grayscale image, then perform double thresholding with a minimum threshold of 0.1 and a maximum threshold of 0.5. Count the number of objects, label each object with a unique, incrementing number directly on the image, and display the histogram of pixel intensities with a logarithmic y-axis scale. Show the median-filtered image, the thresholded and labeled image, and the histogram in your response.
chatGPT:
import cv2
import numpy as np
import matplotlib.pyplot as plt

# Load the image
image_path = '/mnt/data/pennies.png'
original_image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)

# Apply a 10x10 median filter
median_filtered = cv2.medianBlur(original_image, 10)

# Perform double thresholding
# Normalizing the pixel intensities to be between 0 and 1 for thresholding
normalized_img = median_filtered / 255.0
thresholded = np.zeros_like(normalized_img)
thresholded[(normalized_img >= 0.1) & (normalized_img <= 0.5)] = 1

# Find and label the objects
num_labels, labels_im = cv2.connectedComponents(thresholded.astype(np.uint8))

# Create an image that will be used to display the labels
label_hue = np.uint8(179*labels_im/np.max(labels_im))
blank_ch = 255*np.ones_like(label_hue)
labeled_img = cv2.merge([label_hue, blank_ch, blank_ch])
labeled_img = cv2.cvtColor(labeled_img, cv2.COLOR_HSV2BGR)
labeled_img[label_hue==0] = 0

# Draw the labels on the image
for i in range(1, num_labels):
    cv2.putText(labeled_img, str(i), tuple(np.mean(np.argwhere(labels_im == i), axis=0).astype(int)),
                cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255,255,255), 1)

# Create a histogram of pixel intensities with a logarithmic y-axis scale
histogram, bin_edges = np.histogram(original_image, bins=256, range=(0, 256))

# Plotting the histogram
plt.figure()
plt.plot(bin_edges[0:-1], histogram, color='black')
plt.yscale('log')
plt.title('Histogram of Pixel Intensities (Log Scale)')
plt.xlabel('Pixel Intensity')
plt.ylabel('Frequency (Log Scale)')

# Save the images and histogram
median_filtered_path = '/mnt/data/median_filtered.png'
cv2.imwrite(median_filtered_path, median_filtered)
labeled_img_path = '/mnt/data/labeled_img.png'
cv2.imwrite(labeled_img_path, labeled_img)
histogram_path = '/mnt/data/histogram.png'
plt.savefig(histogram_path)
plt.close()

# Return the paths to the saved images and histogram
median_filtered_path, labeled_img_path, histogram_path, num_labels - 1

STDOUT/STDERR
Matplotlib is building the font cache; this may take a moment.
It looks like there was an issue applying a median filter with a kernel size of 10, which must be an odd number. I will correct this and use a 9x9 kernel instead, which is the nearest odd kernel size to 10. Let's proceed with the corrected filter size.
import cv2
import numpy as np
import matplotlib.pyplot as plt

# Load the image
image_path = '/mnt/data/pennies.png'
original_image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)

# Apply a 9x9 median filter (cv2.medianBlur requires an odd kernel size)
median_filtered = cv2.medianBlur(original_image, 9)

# Perform double thresholding
# Normalizing the pixel intensities to be between 0 and 1 for thresholding
normalized_img = median_filtered / 255.0
thresholded = np.zeros_like(normalized_img)
thresholded[(normalized_img >= 0.1) & (normalized_img <= 0.5)] = 1

# Find and label the objects
num_labels, labels_im = cv2.connectedComponents(thresholded.astype(np.uint8))

# Create an image that will be used to display the labels
label_hue = np.uint8(179*labels_im/np.max(labels_im))
blank_ch = 255*np.ones_like(label_hue)
labeled_img = cv2.merge([label_hue, blank_ch, blank_ch])
labeled_img = cv2.cvtColor(labeled_img, cv2.COLOR_HSV2BGR)
labeled_img[label_hue==0] = 0

# Draw the labels on the image
for i in range(1, num_labels):
    cv2.putText(labeled_img, str(i), tuple(np.mean(np.argwhere(labels_im == i), axis=0).astype(int)),
                cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255,255,255), 1)

# Create a histogram of pixel intensities with a logarithmic y-axis scale
histogram, bin_edges = np.histogram(original_image, bins=256, range=(0, 256))

# Plotting the histogram
plt.figure()
plt.plot(bin_edges[0:-1], histogram, color='black')
plt.yscale('log')
plt.title('Histogram of Pixel Intensities (Log Scale)')
plt.xlabel('Pixel Intensity')
plt.ylabel('Frequency (Log Scale)')

# Save the images and histogram
median_filtered_path = '/mnt/data/median_filtered.png'
cv2.imwrite(median_filtered_path, median_filtered)
labeled_img_path = '/mnt/data/labeled_img.png'
cv2.imwrite(labeled_img_path, labeled_img)
histogram_path = '/mnt/data/histogram.png'
plt.savefig(histogram_path)
plt.close()

# Return the paths to the saved images and histogram
median_filtered_path, labeled_img_path, histogram_path, num_labels - 1
The median filtering, thresholding, object labeling, and histogram generation have been completed successfully. Here are the results:
- Median-Filtered Image: This image has been filtered using a 9x9 median filter.
- Thresholded and Labeled Image: The objects have been identified and labeled with unique, incrementing numbers.
- Histogram of Pixel Intensities: The histogram displays the frequency of pixel intensities on a logarithmic scale.
There are a total of 13 objects identified in the image.
You can view and download the images and histogram using the following links:
Now I will display the images and histogram inline for your convenience.
Above are the processed images and histogram:
- On the left is the Median-Filtered Image, which has been smoothed using a 9x9 median filter.
- In the middle is the Thresholded and Labeled Image, where each object detected within the specified thresholds is labeled with a unique number.
- On the right is the Histogram of Pixel Intensities with a logarithmic y-axis scale, showing the distribution of pixel values within the original image.
---------------------
Ok, so as you can see, it correctly counts the number of objects. There are some notable problems, which I'll focus on in my next post.
For now I'm very pleased with the results. I want to try the four other prompts and different input images with objects to count.