Das deutsche Python-Forum

`get_image_info` sieht immer noch falsch aus. Ähnlich wie urljoin gibt es auch ein urlsplit oder urlparse um eine URL wieder auseinander zu nehmen. Ein `return` tief verschachtelt in einer for-Schleife ist schwierig zu lesen.
Das Problem ist aber, dass falls kein a-Tag mit einem img-Tag gefunden wird, None zurückgeliefert wird statt (None, None, None). Das ist schlecht, weil unerwartet und den Fall prüfst Du aber beim Aufrufen auch nicht ab.
Warum ist file_size ein Float? Willst Du auch halbe Bytes verarbeiten können?

@__blackjack__: Danke für die Erklärungen. Sollte die Funktion `download_site()` eine Exception (vielleicht einen `ConnectionError`) zurückgeben, wenn keine Seite heruntergeladen werden konnte?

@Sirius3: Danke für die Erklärungen. Die Funktion `get_image_info()` habe ich weiter angepasst.
Für `urlsplit` oder `urlparse` habe ich noch keinen Beispielcode gefunden, wie man einfach und ohne verschachtelte reguläre Ausdrücke an den Dateinamen kommt. Wie würdest du es machen?

Aktuelle Version der Funktion `get_image_info()` [der restliche Code hat sich nicht geändert]:

Code: Alles auswählen

def get_image_info(element, source):
    # Grabs information about the image
    file_url = None
    file_name = None
    file_size = None

    soup = BeautifulSoup(str(source), 'lxml')
    tags = soup.find_all('a')
    print("Tags:", tags)  # Liste ist immer leer :(

    if tags:
        for tag in tags:
            if tag.find("img"):
                file_url = urllib.parse.urljoin(NASA_APOD_SITE, tag.get('href'))

                # Create our handle for our remote file
                logger.info("Opening remote URL")

                remote_file = urllib.request.urlopen(file_url)

                file_name = os.path.basename(file_url)  # TODO: Funktioniert an sich schon nicht garantiert, weil Pfade etwas anderes als URLs sind

                file_size = int(remote_file.headers.get("content-length"))

    else:
        logger.warning("Could not find an image. May be a video today.")

    return file_url, file_name, file_size

Gruß
Atalanttore

@Atalanttore: jetzt hast Du noch das Problem, dass im Falle dass kein a-Tag ein img-Tag enthält keine Warnung ausgegeben wird.
Der Parameter `element` wird gar nicht benutzt.

Auch an anderen Stellen benutzt Du Rückgabewerte (None) wo es besser wäre Exceptions zu benutzen. `exit` sollte in einem sauberen Programm gar nicht vorkommen, weil bei man für solche Funktionen gar keine Fehlerbehandlung machen kann.

@Sirius3: Danke für die Vorschläge.

Durch das Auskommentieren eines Loggers, der `response.read()` in der Funktion `download_site()` aufruft, gibt die Funktion nun auch den heruntergeladenen HTML-Quellcode zurück. Warum ist das so?

Aktueller Code:

Code: Alles auswählen

from gi.repository import GLib
from bs4 import BeautifulSoup

import logging
import subprocess
import urllib.request, urllib.parse, urllib.error
import re
import os
import random
import glob
from PIL import Image
from sys import stdout
from sys import exit
from lxml import etree
from datetime import datetime, timedelta

NASA_APOD_SITE = 'http://apod.nasa.gov/apod/'
TEMPORARY_DOWNLOAD_PATH = '/tmp/backgrounds/'
CUSTOM_FOLDER = 'nasa-apod-backgrounds'

RESOLUTION_TYPE = 'stretch'
DEFAULT_RESOLUTION_X = 1024
DEFAULT_RESOLUTION_Y = 768

IMAGE_SCROLL = True
IMAGE_DURATION = 1200
SEED_IMAGES = 10
SHOW_DEBUG = False

LOG_LEVEL = logging.DEBUG
LOG_FORMAT = '%(asctime)s %(name)s: %(message)s'

logger = logging.getLogger(__name__)
logger.setLevel(LOG_LEVEL)

formatter = logging.Formatter(LOG_FORMAT)

stream_handler = logging.StreamHandler()
stream_handler.setFormatter(formatter)

logger.addHandler(stream_handler)


# Use XRandR to grab the desktop resolution. If the scaling method is set to 'largest',
# we will attempt to grab it from the largest connected device. If the scaling method
# is set to 'stretch' we will grab it from the current value. Default will simply use
# what was set for the default resolutions.
def find_display_resolution():
    if RESOLUTION_TYPE == 'default':
        logger.info(f"Using default resolution of {DEFAULT_RESOLUTION_X}x{DEFAULT_RESOLUTION_Y}")
        return DEFAULT_RESOLUTION_X, DEFAULT_RESOLUTION_Y

    resolution_x = 0
    resolution_y = 0

    logger.info("Attempting to determine the current resolution.")
    if RESOLUTION_TYPE == 'largest':
        regex_search = 'connected'
    else:
        regex_search = 'current'

    p1 = subprocess.Popen(["xrandr"], stdout=subprocess.PIPE)
    p2 = subprocess.Popen(["grep", regex_search], stdin=p1.stdout, stdout=subprocess.PIPE)  # TODO: Pythons re-Modul verwenden

    p3 = re.findall(regex_search, str(p1.communicate()[0]))
    p1.stdout.close()
    output = str(p2.communicate()[0])

    if RESOLUTION_TYPE == 'largest':
        # We are going to go through the connected devices and get the X/Y from the largest
        matches = re.finditer(" connected ([0-9]+)x([0-9]+)+", output)  # TODO: liefert einen Iterator, der immer „wahr" ist.
        if matches:
            largest = 0
            for match in matches:
                if int(match.group(1)) * int(match.group(2)) > largest:
                    resolution_x = match.group(1)
                    resolution_y = match.group(2)
        else:
            logger.warning("Could not determine largest screen resolution.")

    else:
        reg = re.search(".* current (.*?) x (.*?),.*", output)
        if reg:
            resolution_x = reg.group(1)
            resolution_y = reg.group(2)
        else:
            logger.warning("Could not determine current screen resolution.")

    # If we couldn't find anything automatically use what was set for the defaults
    if resolution_x == 0 or resolution_y == 0:
        resolution_x = DEFAULT_RESOLUTION_X
        resolution_y = DEFAULT_RESOLUTION_Y
        logger.warning("Could not determine resolution automatically. Using defaults.")

    logger.info(f"Using detected resolution of {resolution_x}x{resolution_y}")

    return int(resolution_x), int(resolution_y)


# Uses GLib to find the localized "Downloads" folder
# See: http://askubuntu.com/questions/137896/how-to-get-the-user-downloads-folder-location-with-python
def get_user_download_directory():
    downloads_dir = GLib.get_user_special_dir(GLib.USER_DIRECTORY_DOWNLOAD)

    if downloads_dir:
        # Add any custom folder
        new_path = os.path.join(downloads_dir, CUSTOM_FOLDER)
        logger.info(f"Using automatically detected path: {new_path}")
    else:
        new_path = TEMPORARY_DOWNLOAD_PATH
        logger.warning("Could not determine download folder with GLib. Using default.")
    return new_path


# Download HTML of the site
def download_site(url):
    logger.info("Downloading contents of the site to find the image name")
    opener = urllib.request.build_opener()
    req = urllib.request.Request(url)
    try:
        response = opener.open(req)
        #logger.info(f"Response: {response.read()}")
        reply = response.read().decode()
    except urllib.error.HTTPError as error:
        logger.error(f"Error downloading {url} - {error.code}")
        reply = "error"
    return reply


# Finds the image URL and saves it
def get_image(text):
    logger.info("Grabbing the image URL")
    file_url, filename, file_size = get_image_info(text)
    # If file_url is None, the today's picture might be a video
    if file_url is None:
        return None

    logger.info(f"Found name of image: {filename}")

    save_to = os.path.join(TEMPORARY_DOWNLOAD_PATH, os.path.splitext(filename)[0] + '.png')

    if not os.path.isfile(save_to):
        # If the response body is less than 500 bytes, something went wrong
        if file_size < 500:
            logger.warning("Response less than 500 bytes, probably an error\nAttempting to just grab image source")
            file_url, filename, file_size = get_image_info(text)
            # If file_url is None, the today's picture might be a video
            if file_url is None:
                return None
            logger.info(f"Found name of image: {filename}")
            if file_size < 500:
                # Give up
                logger.error("Could not find image to download")
                exit()

            logger.info("Retrieving image")
            urllib.request.urlretrieve(file_url, save_to, print_download_status)

            # Adding additional padding to ensure entire line 
            logger.info(f"\rDone downloading {human_readable_size(file_size)}       ")
        else:
            urllib.request.urlretrieve(file_url, save_to)
    else:
        logger.info("File exists, moving on")

    return save_to


def get_image_info(source):
    # Grabs information about the image
    file_url = None
    file_name = None
    file_size = None

    soup = BeautifulSoup(str(source), 'lxml')
    tags = soup.find_all('a')
    print("Tags:", tags)  # Liste ist immer leer :(

    if tags:
        for tag in tags:
            if tag.find("img"):
                file_url = urllib.parse.urljoin(NASA_APOD_SITE, tag.get('href'))

                # Create our handle for our remote file
                logger.info("Opening remote URL")

                remote_file = urllib.request.urlopen(file_url)

                file_name = os.path.basename(file_url)  # TODO: Funktioniert an sich schon nicht garantiert, weil Pfade etwas anderes als URLs sind

                file_size = int(remote_file.headers.get("content-length"))

    else:
        logger.warning("Could not find an image. May be a video today.")

    return file_url, file_name, file_size


# Resizes the image to the provided dimensions
def resize_image(filename):
    logger.info("Opening local image")

    image = Image.open(filename)
    current_x, current_y = image.size
    if (current_x, current_y) == (DEFAULT_RESOLUTION_X, DEFAULT_RESOLUTION_Y):
        logger.info("Images are currently equal in size. No need to scale.")
    else:
        logger.info("Resizing the image from", image.size[0], "x", image.size[1], "to", DEFAULT_RESOLUTION_X, "x", DEFAULT_RESOLUTION_Y)
        image = image.resize((DEFAULT_RESOLUTION_X, DEFAULT_RESOLUTION_Y), Image.ANTIALIAS)

        logger.info(f"Saving the image as {filename}")

        with open(filename, 'wb'):
            image.save(filename, 'PNG')


# Sets the new image as the wallpaper
def set_gnome_wallpaper(file_path):
    logger.info("Setting the wallpaper")
    command = "gsettings set org.gnome.desktop.background picture-uri file://" + file_path
    status, output = subprocess.getstatusoutput(command)  # TODO: Statt subprocess.getstatusoutput etwas wie subprocess.run benutzen
    return status


def print_download_status(block_count, block_size, total_size):
    written_size = human_readable_size(block_count * block_size)
    total_size = human_readable_size(total_size)

    # Adding space padding at the end to ensure we overwrite the whole line
    stdout.write(f"\r{written_size} bytes of {total_size}         ")
    stdout.flush()


def human_readable_size(number_bytes):  # TODO: gibt bei Größen größer 1073741824 None zurück.
    for x in ['bytes', 'KB', 'MB']:
        if number_bytes < 1024.0:
            return "%3.2f%s" % (number_bytes, x)
        number_bytes /= 1024.0


# Creates the necessary XML so background images will scroll through
def create_desktop_background_scroll(filename):
    if not IMAGE_SCROLL:
        return filename

    logger.info("Creating XML file for desktop background switching.")

    filename = os.path.join(TEMPORARY_DOWNLOAD_PATH, '/nasa_apod_desktop_backgrounds.xml')

    # Create our base, background element
    background = etree.Element("background")

    # Grab our PNGs we have downloaded
    images = glob.glob(TEMPORARY_DOWNLOAD_PATH + "/*.png")
    num_images = len(images)

    if num_images < SEED_IMAGES:
        # Let's seed some images
        # Start with yesterday and continue going back until we have enough
        logger.info("Downloading some seed images as well")
        days_back = 0
        seed_images_left = SEED_IMAGES
        while seed_images_left > 0:
            days_back += 1
            logger.info(f"Downloading seed image ({seed_images_left} left):")
            day_to_try = datetime.now() - timedelta(days=days_back)

            # Filenames look like /apYYMMDD.html
            seed_filename = os.path.join(NASA_APOD_SITE, "ap" + day_to_try.strftime("%y%m%d") + ".html")
            seed_site_contents = download_site(seed_filename)

            # Make sure we didn't encounter an error for some reason
            if seed_site_contents == "error":
                logger.error("Seed site contains an error")
                continue

            seed_filename = get_image(seed_site_contents)
            # If the content was an video or some other error occurred, skip the
            # rest.
            if seed_filename is None:
                continue

            resize_image(seed_filename)

            # Add this to our list of images
            images.append(seed_filename)
            seed_images_left -= 1
        logger.info("Done downloading seed images")

    # Get our images in a random order so we get a new order every time we get a new file
    random.shuffle(images)
    # Recalculate the number of pictures
    num_images = len(images)

    for i, image in enumerate(images):
        # Create a static entry for keeping this image here for IMAGE_DURATION
        static = etree.SubElement(background, "static")

        # Length of time the background stays
        duration = etree.SubElement(static, "duration")
        duration.text = str(IMAGE_DURATION)

        # Assign the name of the file for our static entry
        static_file = etree.SubElement(static, "file")
        static_file.text = images[i]

        # Create a transition for the animation with a from and to
        transition = etree.SubElement(background, "transition")

        # Length of time for the switch animation
        transition_duration = etree.SubElement(transition, "duration")
        transition_duration.text = "5"

        # We are always transitioning from the current file
        transition_from = etree.SubElement(transition, "from")
        transition_from.text = images[i]

        # Create our tranition to element
        transition_to = etree.SubElement(transition, "to")

        # Check to see if we're at the end, if we are use the first image as the image to
        if i + 1 == num_images:
            transition_to.text = images[0]
        else:
            transition_to.text = images[i + 1]

    xml_tree = etree.ElementTree(background)
    xml_tree.write(filename, pretty_print=True)

    return filename


if __name__ == '__main__':
    logger.info("Starting")

    # Find desktop resolution
    DEFAULT_RESOLUTION_X, DEFAULT_RESOLUTION_Y = find_display_resolution()

    # Set a localized download folder
    TEMPORARY_DOWNLOAD_PATH = get_user_download_directory()

    # Create the download path if it doesn't exist
    if not os.path.exists(os.path.expanduser(TEMPORARY_DOWNLOAD_PATH)):
        os.makedirs(os.path.expanduser(TEMPORARY_DOWNLOAD_PATH))

    # Grab the HTML contents of the file
    site_contents = download_site(NASA_APOD_SITE)
    if site_contents == "error":
        logger.error("Could not contact site.")
        exit()  # TODO: `exit` sollte in einem sauberen Programm gar nicht vorkommen

    # Download the image
    filename = get_image(site_contents)
    if filename is not None:
        # Resize the image
        resize_image(filename)

    # Create the desktop switching xml
    filename = create_desktop_background_scroll(filename)
    # If the script was unable todays image and IMAGE_SCROLL is set to False,
    # the script exits
    if filename is None:
        logger.error("Today's image could not be downloaded.")
        exit()  # TODO: `exit` sollte in einem sauberen Programm gar nicht vorkommen

    # Set the wallpaper
    status = set_gnome_wallpaper(filename)
    logger.info("Finished!")

Gruß
Atalanttore

@Atalanttore: Weil `response.read()` die gesamte Antwort liest. Die ist dann ”weg”, wie das bei Dateien so üblich ist.

@__blackjack__: Danke, dass wusste ich noch nicht. Mit dieser Info bin ich bei der Programmausführung nun wieder ein Stück weiter gekommen.

Es erscheinen nun folgende Fehlermeldungen nachdem insgesamt 4 Bilder heruntergeladen wurden:

Code: Alles auswählen

2019-07-13 17:57:29,137 __main__: Done downloading images
Traceback (most recent call last):
  File "/home/ata/PycharmProjects/nasa-apod-desktop/nasa_apod_desktop.py", line 384, in <module>
    filename = create_desktop_background_scroll(filename)
  File "/home/ata/PycharmProjects/nasa-apod-desktop/nasa_apod_desktop.py", line 353, in create_desktop_background_scroll
    xml_tree.write(filename, pretty_print=True)
  File "src/lxml/etree.pyx", line 2039, in lxml.etree._ElementTree.write
  File "src/lxml/serializer.pxi", line 721, in lxml.etree._tofilelike
  File "src/lxml/serializer.pxi", line 780, in lxml.etree._create_output_buffer
  File "src/lxml/serializer.pxi", line 770, in lxml.etree._create_output_buffer
PermissionError: [Errno 13] Permission denied

Warum scheitert es an einer fehlenden Berechtigung?

Im Code wird an mehreren Stellen der Wert einer Konstante mit dem Rückgabewert einer Funktion ersetzt.
Z.B.:
Code: Alles auswählen
```
DEFAULT_RESOLUTION_X, DEFAULT_RESOLUTION_Y = find_display_resolution()
```
Was ist davon zu halten?

Aktueller Code:

Code: Alles auswählen

from gi.repository import GLib
from bs4 import BeautifulSoup

import logging
import subprocess
import urllib.request, urllib.parse, urllib.error
import re
import os
import random
import glob
from PIL import Image
from sys import stdout
from sys import exit
from lxml import etree
from datetime import datetime, timedelta

NASA_APOD_SITE = 'http://apod.nasa.gov/apod/'
TEMPORARY_DOWNLOAD_PATH = '/tmp/backgrounds/'
CUSTOM_FOLDER = 'nasa-apod-backgrounds'

RESOLUTION_TYPE = 'stretch'
DEFAULT_RESOLUTION_X = 1024
DEFAULT_RESOLUTION_Y = 768

IMAGE_SCROLL = True
IMAGE_DURATION = 1200
COUNT_IMAGES_FROM_PREVIOUS_DAYS = 3

LOG_LEVEL = logging.DEBUG
LOG_FORMAT = '%(asctime)s %(name)s: %(message)s'

logger = logging.getLogger(__name__)
logger.setLevel(LOG_LEVEL)

formatter = logging.Formatter(LOG_FORMAT)

stream_handler = logging.StreamHandler()
stream_handler.setFormatter(formatter)

logger.addHandler(stream_handler)


# Use XRandR to grab the desktop resolution. If the scaling method is set to 'largest',
# we will attempt to grab it from the largest connected device. If the scaling method
# is set to 'stretch' we will grab it from the current value. Default will simply use
# what was set for the default resolutions.
def find_display_resolution():  # TODO: Überhaupt notwendig?

    if RESOLUTION_TYPE == 'default':
        logger.info(f"Using default resolution of {DEFAULT_RESOLUTION_X}x{DEFAULT_RESOLUTION_Y}")
        return DEFAULT_RESOLUTION_X, DEFAULT_RESOLUTION_Y

    resolution_x = 0
    resolution_y = 0

    logger.info("Attempting to determine the current resolution.")
    if RESOLUTION_TYPE == 'largest':
        regex_search = 'connected'
    else:
        regex_search = 'current'

    p1 = subprocess.Popen(["xrandr"], stdout=subprocess.PIPE)
    p2 = subprocess.Popen(["grep", regex_search], stdin=p1.stdout, stdout=subprocess.PIPE)  # TODO: Pythons re-Modul verwenden

    p3 = re.findall(regex_search, str(p1.communicate()[0]))
    p1.stdout.close()
    output = str(p2.communicate()[0])

    if RESOLUTION_TYPE == 'largest':
        # We are going to go through the connected devices and get the X/Y from the largest
        matches = re.finditer(" connected ([0-9]+)x([0-9]+)+", output)  # TODO: liefert einen Iterator, der immer „wahr" ist.
        if matches:
            largest = 0
            for match in matches:
                if int(match.group(1)) * int(match.group(2)) > largest:
                    resolution_x = match.group(1)
                    resolution_y = match.group(2)
        else:
            logger.warning("Could not determine largest screen resolution.")

    else:
        reg = re.search(".* current (.*?) x (.*?),.*", output)
        if reg:
            resolution_x = reg.group(1)
            resolution_y = reg.group(2)
        else:
            logger.warning("Could not determine current screen resolution.")

    # If we couldn't find anything automatically use what was set for the defaults
    if resolution_x == 0 or resolution_y == 0:
        resolution_x = DEFAULT_RESOLUTION_X
        resolution_y = DEFAULT_RESOLUTION_Y
        logger.warning("Could not determine resolution automatically. Using defaults.")

    logger.info(f"Using detected resolution of {resolution_x}x{resolution_y}")

    return int(resolution_x), int(resolution_y)


# Uses GLib to find the localized "Downloads" folder
# See: http://askubuntu.com/questions/137896/how-to-get-the-user-downloads-folder-location-with-python
def get_user_download_directory():
    downloads_dir = GLib.get_user_special_dir(GLib.USER_DIRECTORY_DOWNLOAD)

    if downloads_dir:
        # Add any custom folder
        new_path = os.path.join(downloads_dir, CUSTOM_FOLDER)
        logger.info(f"Using automatically detected path: {new_path}")
    else:
        new_path = TEMPORARY_DOWNLOAD_PATH
        logger.warning("Could not determine download folder with GLib. Using default.")
    return new_path


# Download HTML of the site
def download_site(url):
    logger.info("Downloading contents of the site to find the image name")
    opener = urllib.request.build_opener()
    req = urllib.request.Request(url)
    try:
        response = opener.open(req)
        site = response.read()
        logger.info(f"Response: {site}")
        reply = site.decode()
    except urllib.error.HTTPError as error:
        logger.error(f"Error downloading {url} - {error.code}")
        reply = "error"
    return reply


# Finds the image URL and saves it
def get_image(text):
    logger.info("Grabbing the image URL")
    file_url, filename, file_size = get_image_info(text)
    # If file_url is None, the today's picture might be a video
    if file_url is None:
        return None  # TODO: Exception benutzen

    logger.info(f"Found name of image: {filename}")

    save_to = os.path.join(TEMPORARY_DOWNLOAD_PATH, os.path.splitext(filename)[0] + '.png')

    if not os.path.isfile(save_to):
        # If the response body is less than 500 bytes, something went wrong
        if file_size < 500:
            logger.warning("Response less than 500 bytes, probably an error\nAttempting to just grab image source")
            file_url, filename, file_size = get_image_info(text)
            # If file_url is None, the today's picture might be a video
            if file_url is None:
                return None  # TODO: Exception benutzen
            logger.info(f"Found name of image: {filename}")
            if file_size < 500:
                # Give up
                logger.error("Could not find image to download")
                exit()  # TODO: `exit` sollte in einem sauberen Programm gar nicht vorkommen

            logger.info("Retrieving image")
            urllib.request.urlretrieve(file_url, save_to, print_download_status)

            # Adding additional padding to ensure entire line 
            logger.info(f"\rDone downloading {human_readable_size(file_size)}       ")
        else:
            urllib.request.urlretrieve(file_url, save_to)
    else:
        logger.info("File exists, moving on")

    return save_to


def get_image_info(source):
    # Grabs information about the image
    file_url = None
    file_name = None
    file_size = None

    soup = BeautifulSoup(str(source), 'lxml')
    tags = soup.find_all('a')

    if tags:
        for tag in tags:
            if tag.find("img"):  # TODO: Warnung ausgegeben, wenn kein a-Tag ein img-Tag enthält
                file_url = urllib.parse.urljoin(NASA_APOD_SITE, tag.get('href'))
                # Create our handle for our remote file
                logger.info("Opening remote URL")

                remote_file = urllib.request.urlopen(file_url)

                file_name = os.path.basename(file_url)  # TODO: Funktioniert an sich schon nicht garantiert, weil Pfade etwas anderes als URLs sind

                file_size = int(remote_file.headers.get("content-length"))

    else:
        logger.warning("Could not find an image. May be a video today.")

    return file_url, file_name, file_size


# Resizes the image to the provided dimensions
def resize_image(filename):
    logger.info("Opening local image")

    image = Image.open(filename)
    current_x, current_y = image.size
    if (current_x, current_y) == (DEFAULT_RESOLUTION_X, DEFAULT_RESOLUTION_Y):
        logger.info("Images are currently equal in size. No need to scale.")
    else:
        logger.info(f"Resizing the image from {image.size[0]} x {image.size[1]} to {DEFAULT_RESOLUTION_X} x {DEFAULT_RESOLUTION_Y}")
        image = image.resize((DEFAULT_RESOLUTION_X, DEFAULT_RESOLUTION_Y), Image.ANTIALIAS)

        logger.info(f"Saving the image as {filename}")

        with open(filename, 'wb'):
            image.save(filename, 'PNG')


# Sets the new image as the wallpaper
def set_gnome_wallpaper(file_path):
    logger.info("Setting the wallpaper")
    command = "gsettings set org.gnome.desktop.background picture-uri file://" + file_path
    status, output = subprocess.getstatusoutput(command)  # TODO: Statt subprocess.getstatusoutput etwas wie subprocess.run benutzen
    return status


def print_download_status(block_count, block_size, total_size):
    written_size = human_readable_size(block_count * block_size)
    total_size = human_readable_size(total_size)

    # Adding space padding at the end to ensure we overwrite the whole line
    stdout.write(f"\r{written_size} bytes of {total_size}         ")
    stdout.flush()


def human_readable_size(number_bytes):  # TODO: gibt bei Größen größer 1073741824 None zurück.
    for x in ['bytes', 'KB', 'MB']:
        if number_bytes < 1024.0:
            return "%3.2f%s" % (number_bytes, x)
        number_bytes /= 1024.0


# Creates the necessary XML so background images will scroll through
def create_desktop_background_scroll(filename):
    if not IMAGE_SCROLL:
        return filename

    logger.info("Creating XML file for desktop background switching.")

    filename = os.path.join(TEMPORARY_DOWNLOAD_PATH, '/nasa_apod_desktop_backgrounds.xml')

    # Create our base, background element
    background = etree.Element("background")

    # Grab our PNGs we have downloaded
    images = glob.glob(TEMPORARY_DOWNLOAD_PATH + "/*.png")
    num_images = len(images)

    if num_images < COUNT_IMAGES_FROM_PREVIOUS_DAYS:
        # Start with yesterday and continue going back until we have enough
        logger.info("Downloading images of previous days as well")

        days_back = 0
        images_left = COUNT_IMAGES_FROM_PREVIOUS_DAYS

        while images_left > 0:
            days_back += 1
            logger.info(f"Downloading image ({images_left} left):")
            day_to_try = datetime.now() - timedelta(days=days_back)

            # Filenames look like /apYYMMDD.html
            archive_filename = os.path.join(NASA_APOD_SITE, "ap" + day_to_try.strftime("%y%m%d") + ".html")
            archive_site_contents = download_site(archive_filename)

            # Make sure we didn't encounter an error for some reason
            if archive_site_contents == "error":
                logger.error("Archive site contains an error")
                continue

            archive_filename = get_image(archive_site_contents)
            # If the content was an video or some other error occurred, skip the
            # rest.
            if archive_filename is None:
                continue

            resize_image(archive_filename)

            # Add this to our list of images
            images.append(archive_filename)
            images_left -= 1
        logger.info("Done downloading images")

    # Get our images in a random order so we get a new order every time we get a new file
    random.shuffle(images)
    # Recalculate the number of pictures
    num_images = len(images)

    for i, image in enumerate(images):
        # Create a static entry for keeping this image here for IMAGE_DURATION
        static = etree.SubElement(background, "static")

        # Length of time the background stays
        duration = etree.SubElement(static, "duration")
        duration.text = str(IMAGE_DURATION)

        # Assign the name of the file for our static entry
        static_file = etree.SubElement(static, "file")
        static_file.text = images[i]

        # Create a transition for the animation with a from and to
        transition = etree.SubElement(background, "transition")

        # Length of time for the switch animation
        transition_duration = etree.SubElement(transition, "duration")
        transition_duration.text = "5"

        # We are always transitioning from the current file
        transition_from = etree.SubElement(transition, "from")
        transition_from.text = images[i]

        # Create our tranition to element
        transition_to = etree.SubElement(transition, "to")

        # Check to see if we're at the end, if we are use the first image as the image to
        if i + 1 == num_images:
            transition_to.text = images[0]
        else:
            transition_to.text = images[i + 1]

    xml_tree = etree.ElementTree(background)
    xml_tree.write(filename, pretty_print=True)

    return filename


if __name__ == '__main__':
    logger.info("Starting")

    # Find desktop resolution
    DEFAULT_RESOLUTION_X, DEFAULT_RESOLUTION_Y = find_display_resolution()

    # Set a localized download folder
    TEMPORARY_DOWNLOAD_PATH = get_user_download_directory()

    # Create the download path if it doesn't exist
    if not os.path.exists(os.path.expanduser(TEMPORARY_DOWNLOAD_PATH)):
        os.makedirs(os.path.expanduser(TEMPORARY_DOWNLOAD_PATH))

    # Grab the HTML contents of the file
    site_contents = download_site(NASA_APOD_SITE)
    if site_contents == "error":
        logger.error("Could not contact site.")
        exit()  # TODO: `exit` sollte in einem sauberen Programm gar nicht vorkommen

    # Download the image
    filename = get_image(site_contents)
    if filename is not None:
        # Resize the image
        resize_image(filename)

    # Create the desktop switching xml
    filename = create_desktop_background_scroll(filename)
    # If the script was unable todays image and IMAGE_SCROLL is set to False,
    # the script exits
    if filename is None:
        logger.error("Today's image could not be downloaded.")
        exit()  # TODO: `exit` sollte in einem sauberen Programm gar nicht vorkommen

    # Set the wallpaper
    status = set_gnome_wallpaper(filename)
    logger.info("Finished!")

Gruß
Atalanttore

@Atalanttore: Lass Dir doch mal ausgeben was da versucht wird wohin zu speichern.

@__blackjack__: Danke für den Tipp. Nach `/nasa_apod_desktop_backgrounds.xml` mit Benutzerrechten speichern funktioniert natürlich nicht.

Vor dem Dateinamen habe ich nun den Schrägstrich entfernt und die Zeile mit dem zu `path_with_filename` geänderten Bezeichner sieht nun so aus.

Code: Alles auswählen

path_with_filename = os.path.join(TEMPORARY_DOWNLOAD_PATH, 'nasa_apod_desktop_backgrounds.xml')

Nun hat sich erstmalig das Desktop-Hintergrundbild geändert und es erscheint jetzt ein heruntergeladenes Astronomisches Bild des Tages. Der zeitgesteuerte Wechsel der Bilder funktioniert allerdings noch nicht. Die Auflösung des Bildes ist auch nicht ideal und das Seitenverhältnis wurde nicht beibehalten.

Ist es überhaupt notwendig, die Größe der heruntergeladenen Bilder mit der Funktion `resize_image()` an die Bildschirmauflösung anzupassen?

Gruß
Atalanttore

__blackjack__ hat geschrieben: Sonntag 7. Juli 2019, 21:55 Dann wird auch wieder etwas aus `os.path` mit einer URL verwendet. Das funktioniert an sich schon nicht garantiert, weil Pfade etwas anderes als URLs sind, und es fällt auch auf Systemen wo Pfade und URLs sich ähneln auf die Nase wenn die URL noch einen „query“ und/oder „fragment“ Anteil besitzt.

Was ist der Unterschied zwischen Pfade und URLs (auf einem Linux-System)?
`os.path.basename(file_url)` funktioniert zwar, aber soll nicht immer funktionieren.
Wie extrahiert man am besten (pythonischten) den Dateiname aus einer URL?

Gruß
Atalanttore

Mit https://docs.python.org/3/library/urlli ... llib.parse - und URLs können Hosts, Ports, Usernamen, Passwörter, Protokolle, Parameter enthalten. Hast du sowas schon mal bei Dateien gesehen?

@__deets__: So wie ich es jetzt verstanden habe, wird ein Pfad zur URL, wenn er mehr als Verzeichnisnamen und Dateinamen enthält.

Gruß
Atalanttore

Naja. Finde ich eine komische Sichtweise. Pfade und URLs haben Überlappungen. Aber das eine ist keine untermenge des anderen. Zb gibt es AFAIK kein Verzeichnis hoch in URLs

@__deets__: Mit "Verzeichnis hoch" meinst du den Befehl oder etwas anderes?

Nach weiterer Suche bin ich nun auf eine Funktion zur Extraktion des Dateinamens aus einer URL gestoßen.

Ich habe den Code der Funktion auf das Wesentliche gekürzt:

Code: Alles auswählen

def url2filename(url):
    """
    Return basename corresponding to url.
    Based on https://gist.github.com/zed/c2168b9c52b032b5fb7d
    """

    url_path = urllib.parse.urlsplit(url).path
    basename = posixpath.basename(urllib.parse.unquote(url_path))

    if (os.path.basename(basename) != basename or urllib.parse.unquote(posixpath.basename(url_path)) != basename):
        raise ValueError  # reject '%2f' or 'dir%5Cbasename.ext' on Windows
    return basename

Ist diese Funktion ein besserer Ansatz als nur `os.path.basename()` dafür zu verwenden?

Gruß
Atalanttore

Nein. Ich meine “../../..”. Das sind immer Pfade. In der Funktion ist mir zu oft unquote aufgerufen, mach das EINMAL am Anfang. Und da man weiß, das URLs den / zur Trennung der Komponenten nutzen, würde ich auch das als simples split Argument benutzten. Da posixpath zu nutzen nur weil das zufällig den gleichen Trenner hat ist nicht wirklich besser als das os.path Modul.

@__deets__: Ich habe den Code nach deinen Empfehlungen, sofern ich alles richtig verstanden habe, und noch ein wenig mehr umgebaut. Auf reguläre Ausdrücke habe ich verzichtet.

Code:

Code: Alles auswählen

import fnmatch
import urllib.parse

URL = "https://apod.nasa.gov/image/1906/gendlerM83-New-HST-ESO-LL.jpg"


def get_basename(url, file_extension):
    """Return basename corresponding to url."""

    url_path = urllib.parse.urlsplit(url).path
    unquoted_url= urllib.parse.unquote(url_path)
    basename = unquoted_url.split("/")[-1]

    if not fnmatch.fnmatch(basename, f"*.{file_extension}"):
        raise ValueError

    return basename


print(get_basename(URL, 'jpg'))

Gruß
Atalanttore

Das deutsche Python-Forum

Download-Skript von Python 2 auf Python 3 portieren

Re: Download-Skript von Python 2 auf Python 3 portieren

Re: Download-Skript von Python 2 auf Python 3 portieren

Re: Download-Skript von Python 2 auf Python 3 portieren

Re: Download-Skript von Python 2 auf Python 3 portieren

Re: Download-Skript von Python 2 auf Python 3 portieren

Re: Download-Skript von Python 2 auf Python 3 portieren

Re: Download-Skript von Python 2 auf Python 3 portieren

Re: Download-Skript von Python 2 auf Python 3 portieren

Re: Download-Skript von Python 2 auf Python 3 portieren

Re: Download-Skript von Python 2 auf Python 3 portieren

Re: Download-Skript von Python 2 auf Python 3 portieren

Re: Download-Skript von Python 2 auf Python 3 portieren

Re: Download-Skript von Python 2 auf Python 3 portieren

Re: Download-Skript von Python 2 auf Python 3 portieren

Re: Download-Skript von Python 2 auf Python 3 portieren