Porting a download script from Python 2 to Python 3

Programming for GNOME and GTK+, GUI creation with Glade.
__blackjack__
User
Posts: 3524
Joined: Saturday, 2 June 2018, 10:21

Sunday, 7 July 2019, 19:41

@Atalanttore: The BeautifulSoup methods with the “unpythonic” names should no longer be used. You already used `find_all()` instead of `findAll()`. Likewise, use `find()` instead of `findChild()`.

`os.path.join()` is for file system paths, not for URLs! For URLs, `urllib.parse.urljoin()` is the appropriate function.
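
A minimal sketch of the difference, using only the standard library. The relative links are the ones that appear on the APOD page; `posixpath.join()` is used as a stand-in for `os.path.join()` on a POSIX system, so the result does not depend on the platform the sketch runs on.

```python
import posixpath
from urllib.parse import urljoin

NASA_APOD_SITE = 'http://apod.nasa.gov/apod/'

# Relative link: resolved against the base URL.
image_url = urljoin(NASA_APOD_SITE, 'image/1907/CrescentSaturn_cassini_4824.jpg')

# Link with a leading slash: urljoin() resolves it against the host ...
rss_url = urljoin(NASA_APOD_SITE, '/apod.rss')

# ... while the path function treats it as an absolute path and drops the base.
rss_path = posixpath.join(NASA_APOD_SITE, '/apod.rss')

print(image_url)
print(rss_url)
print(rss_path)
```

The path function knows nothing about schemes or hosts, which is why the last result is no longer a URL at all.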
A train station is where trains stop.
A bus station is where busses stop.
A Work Station is where …
Atalanttore
User
Posts: 338
Joined: Friday, 6 August 2010, 17:03

Sunday, 7 July 2019, 20:34

@__blackjack__: Thanks for the hints.

I have now built the code for extracting the image URL from the HTML file into the download script.

At the moment there is a `ValueError` (plus errors during logging), because no HTML code to parse is passed to the function `get_image_info()`.

Error message:

Code:

/usr/bin/python3.6 /home/ata/PycharmProjects/nasa-apod-desktop/nasa_apod_desktop.py
2019-07-07 21:19:36,857 __main__: Starting
2019-07-07 21:19:36,857 __main__: Attempting to determine the current resolution.
2019-07-07 21:19:36,957 __main__: Using detected resolution of 3840x1080
--- Logging error ---
Traceback (most recent call last):
  File "/usr/lib/python3.6/logging/__init__.py", line 994, in emit
    msg = self.format(record)
  File "/usr/lib/python3.6/logging/__init__.py", line 840, in format
    return fmt.format(record)
  File "/usr/lib/python3.6/logging/__init__.py", line 577, in format
    record.message = record.getMessage()
  File "/usr/lib/python3.6/logging/__init__.py", line 338, in getMessage
    msg = msg % self.args
TypeError: not all arguments converted during string formatting
Call stack:
  File "/home/ata/PycharmProjects/nasa-apod-desktop/nasa_apod_desktop.py", line 387, in <module>
    TEMPORARY_DOWNLOAD_PATH = get_user_download_directory()
  File "/home/ata/PycharmProjects/nasa-apod-desktop/nasa_apod_desktop.py", line 134, in get_user_download_directory
    logger.info("Using automatically detected path:", new_path)
Message: 'Using automatically detected path:'
Arguments: ('/home/ata/Downloads/nasa-apod-backgrounds',)
2019-07-07 21:19:36,962 __main__: Downloading contents of the site to find the image name
--- Logging error ---
Traceback (most recent call last):
  File "/usr/lib/python3.6/logging/__init__.py", line 994, in emit
    msg = self.format(record)
  File "/usr/lib/python3.6/logging/__init__.py", line 840, in format
    return fmt.format(record)
  File "/usr/lib/python3.6/logging/__init__.py", line 577, in format
    record.message = record.getMessage()
  File "/usr/lib/python3.6/logging/__init__.py", line 338, in getMessage
    msg = msg % self.args
TypeError: not all arguments converted during string formatting
Call stack:
  File "/home/ata/PycharmProjects/nasa-apod-desktop/nasa_apod_desktop.py", line 394, in <module>
    site_contents = download_site(NASA_APOD_SITE)
  File "/home/ata/PycharmProjects/nasa-apod-desktop/nasa_apod_desktop.py", line 148, in download_site
    logger.info("Response", response.read())
Message: 'Response'
Arguments: (b'<!doctype html>\n<html>\n<head>\n<title>Astronomy Picture of the Day\n</title> \n<!-- gsfc meta tags -->\n<meta name="orgcode" content="661">\n<meta name="rno" content="phillip.a.newman">\n<meta name="content-owner" content="Jerry.T.Bonnell.1">\n<meta name="webmaster" content="Stephen.F.Fantasia.1">\n<meta name="description" content="A different astronomy and space science\nrelated image is featured each day, along with a brief explanation.">\n<!-- -->\n<meta name="keywords" content="Saturn, rings, shadow">\n<!-- -->\n<script language="javascript" id="_fed_an_ua_tag"\nsrc="//dap.digitalgov.gov/Universal-Federated-Analytics-Min.js?agency=NASA">\n</script>\n\n</head>\n\n<body BGCOLOR="#F4F4FF" text="#000000" link="#0000FF" vlink="#7F0F9F"\nalink="#FF0000">\n\n<center>\n<h1> Astronomy Picture of the Day </h1>\n<p>\n\n<a href="archivepix.html">Discover the cosmos!</a>\nEach day a different image or photograph of our fascinating universe is\nfeatured, along with a brief explanation written by a professional astronomer.\n<p>\n\n2019 July 7 \n<br> \n<a href="image/1907/CrescentSaturn_cassini_4824.jpg">\n<IMG SRC="image/1907/CrescentSaturn_cassini_1080.jpg"\nalt="See Explanation.  Clicking on the picture will download\n the highest resolution version available." style="max-width:100%"></a>\n</center>\n\n<center>\n<b> Crescent Saturn </b> <br> \n<b> Image Credit: </b> \n<a href="https://www.nasa.gov/">NASA</a>, \n<a href="https://www.esa.int/">ESA</a>, \n<a href="https://www.spacescience.org/">SSI</a>,\n<a href="http://ciclops.org/ir_index_main/Cassini">Cassini Imaging Team</a>\n</center> <p> \n\n<b> Explanation: </b> \nSaturn never shows a crescent phase -- from Earth.  
\n\nBut when viewed from beyond, the \n<a href="https://solarsystem.nasa.gov/planets/saturn/overview/">majestic \ngiant planet</a> can show an unfamiliar diminutive sliver.\n\nThis <a href="https://photojournal.jpl.nasa.gov/catalog/PIA08388"\n>image of crescent Saturn</a> in natural color was taken by the robotic \n<a href="https://solarsystem.nasa.gov/missions/cassini/overview/"\n>Cassini spacecraft</a> in 2007.\n\nThe featured image captures \n<a href="https://en.wikipedia.org/wiki/Rings_of_Saturn">Saturn\'s\nmajestic rings</a> from the side of the ring plane opposite\nthe Sun -- the <a href="ap121222.html">unilluminated side</a> -- another\nvista not visible from Earth.\n\nPictured are many of \n<a href="https://en.wikipedia.org/wiki/Saturn">Saturn</a>\'s photogenic wonders, including the \n<a href="ap060503.html">subtle colors</a> of \n<a href="ap041102.html">cloud bands</a>, the complex \nshadows of the rings on the planet, and \nthe <a href="ap040721.html">shadow of the planet</a>\non the rings.\n\nA careful eye will find the moons \n<a href="ap170111.html">Mimas</a> (2 o\'clock) and \n<a href="ap061107.html">Janus</a> (4 o\'clock), \nbut the real challenge is to find \n<a href="ap051123.html">Pandora</a> (8 o\'clock). 
\n\nSaturn is now nearly \n<a href="https://in-the-sky.org/news.php?id=20190709_12_100"\n>opposite from the Sun</a> in the Earth\'s sky and so \n<a href="ap180614.html">can be seen</a> \nin the evening starting just after sunset for the rest of the night.\n\n\n<p> <center> \n<b> Tomorrow\'s picture: </b>galactic center in radio\n\n<p> <hr>\n<a href="ap190706.html">&lt;</a>\n| <a href="archivepix.html">Archive</a>\n| <a href="lib/apsubmit2015.html">Submissions</a> \n| <a href="lib/aptree.html">Index</a>\n| <a href="https://antwrp.gsfc.nasa.gov/cgi-bin/apod/apod_search">Search</a>\n| <a href="calendar/allyears.html">Calendar</a>\n| <a href="/apod.rss">RSS</a>\n| <a href="lib/edlinks.html">Education</a>\n| <a href="lib/about_apod.html">About APOD</a>\n| <a href=\n"http://asterisk.apod.com/discuss_apod.php?date=190707">Discuss</a>\n| <a href="ap190708.html">&gt;</a>\n\n<hr><p>\n<b> Authors & editors: </b>\n<a href="http://www.phy.mtu.edu/faculty/Nemiroff.html">Robert Nemiroff</a>\n(<a href="http://www.phy.mtu.edu/">MTU</a>) &\n<a href="https://antwrp.gsfc.nasa.gov/htmltest/jbonnell/www/bonnell.html"\n>Jerry Bonnell</a> (<a href="http://www.astro.umd.edu/">UMCP</a>)<br>\n<b>NASA Official: </b> Phillip Newman\n<a href="lib/about_apod.html#srapply">Specific rights apply</a>.<br>\n<a href="https://www.nasa.gov/about/highlights/HP_Privacy.html">NASA Web\nPrivacy Policy and Important Notices</a><br>\n<b>A service of:</b>\n<a href="https://astrophysics.gsfc.nasa.gov/">ASD</a> at\n<a href="https://www.nasa.gov/">NASA</a> /\n<a href="https://www.nasa.gov/centers/goddard/">GSFC</a>\n<br><b>&</b> <a href="http://www.mtu.edu/">Michigan Tech. U.</a><br>\n</center>\n</body>\n</html>\n\n',)
2019-07-07 21:19:37,606 __main__: Grabbing the image URL
2019-07-07 21:19:37,608 __main__: Opening remote URL
Traceback (most recent call last):
  File "/home/ata/PycharmProjects/nasa-apod-desktop/nasa_apod_desktop.py", line 400, in <module>
    filename = get_image(site_contents)
  File "/home/ata/PycharmProjects/nasa-apod-desktop/nasa_apod_desktop.py", line 159, in get_image
    file_url, filename, file_size = get_image_info('a href', text)
  File "/home/ata/PycharmProjects/nasa-apod-desktop/nasa_apod_desktop.py", line 237, in get_image_info
    remote_file = urllib.request.urlopen(file_url)
  File "/usr/lib/python3.6/urllib/request.py", line 223, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.6/urllib/request.py", line 511, in open
    req = Request(fullurl, data)
  File "/usr/lib/python3.6/urllib/request.py", line 329, in __init__
    self.full_url = url
  File "/usr/lib/python3.6/urllib/request.py", line 355, in full_url
    self._parse()
  File "/usr/lib/python3.6/urllib/request.py", line 384, in _parse
    raise ValueError("unknown url type: %r" % self.full_url)
ValueError: unknown url type: ''

Process finished with exit code 1

Current code:

Code:

from gi.repository import GLib
from bs4 import BeautifulSoup

import logging
import subprocess
import urllib.request, urllib.parse, urllib.error
import re
import os
import random
import glob
from PIL import Image
from sys import stdout
from sys import exit
from lxml import etree
from datetime import datetime, timedelta

NASA_APOD_SITE = 'http://apod.nasa.gov/apod/'
TEMPORARY_DOWNLOAD_PATH = '/tmp/backgrounds/'
CUSTOM_FOLDER = 'nasa-apod-backgrounds'

RESOLUTION_TYPE = 'stretch'
DEFAULT_RESOLUTION_X = 1024
DEFAULT_RESOLUTION_Y = 768

IMAGE_SCROLL = True
IMAGE_DURATION = 1200
SEED_IMAGES = 10
SHOW_DEBUG = False

LOG_LEVEL = logging.DEBUG
LOG_FORMAT = '%(asctime)s %(name)s: %(message)s'

logger = logging.getLogger(__name__)
logger.setLevel(LOG_LEVEL)

formatter = logging.Formatter(LOG_FORMAT)

stream_handler = logging.StreamHandler()
stream_handler.setFormatter(formatter)

logger.addHandler(stream_handler)


# Use XRandR to grab the desktop resolution. If the scaling method is set to 'largest',
# we will attempt to grab it from the largest connected device. If the scaling method
# is set to 'stretch' we will grab it from the current value. Default will simply use
# what was set for the default resolutions.
def find_display_resolution():
    if RESOLUTION_TYPE == 'default':
        logger.info(f"Using default resolution of {DEFAULT_RESOLUTION_X}x{DEFAULT_RESOLUTION_Y}")
        return DEFAULT_RESOLUTION_X, DEFAULT_RESOLUTION_Y

    resolution_x = 0
    resolution_y = 0

    logger.info("Attempting to determine the current resolution.")
    if RESOLUTION_TYPE == 'largest':
        regex_search = 'connected'
    else:
        regex_search = 'current'

    p1 = subprocess.Popen(["xrandr"], stdout=subprocess.PIPE)
    p2 = subprocess.Popen(["grep", regex_search], stdin=p1.stdout, stdout=subprocess.PIPE)  # TODO: use Python's re module

    p3 = re.findall(regex_search, str(p1.communicate()[0]))
    p1.stdout.close()
    output = str(p2.communicate()[0])

    if RESOLUTION_TYPE == 'largest':
        # We are going to go through the connected devices and get the X/Y from the largest
        matches = re.finditer(" connected ([0-9]+)x([0-9]+)+", output)  # TODO: returns an iterator, which is always "truthy".
        if matches:
            largest = 0
            for match in matches:
                if int(match.group(1)) * int(match.group(2)) > largest:
                    resolution_x = match.group(1)
                    resolution_y = match.group(2)
        else:
            logger.warning("Could not determine largest screen resolution.")

    else:
        reg = re.search(".* current (.*?) x (.*?),.*", output)
        if reg:
            resolution_x = reg.group(1)
            resolution_y = reg.group(2)
        else:
            logger.warning("Could not determine current screen resolution.")

    # If we couldn't find anything automatically use what was set for the defaults
    if resolution_x == 0 or resolution_y == 0:
        resolution_x = DEFAULT_RESOLUTION_X
        resolution_y = DEFAULT_RESOLUTION_Y
        logger.warning("Could not determine resolution automatically. Using defaults.")

    logger.info(f"Using detected resolution of {resolution_x}x{resolution_y}")

    return int(resolution_x), int(resolution_y)


# Uses GLib to find the localized "Downloads" folder
# See: http://askubuntu.com/questions/137896/how-to-get-the-user-downloads-folder-location-with-python
def get_user_download_directory():
    downloads_dir = GLib.get_user_special_dir(GLib.USER_DIRECTORY_DOWNLOAD)

    if downloads_dir:
        # Add any custom folder
        new_path = os.path.join(downloads_dir, CUSTOM_FOLDER)
        logger.info("Using automatically detected path:", new_path)
    else:
        new_path = TEMPORARY_DOWNLOAD_PATH
        logger.warning("Could not determine download folder with GLib. Using default.")
    return new_path


# Download HTML of the site
def download_site(url):
    logger.info("Downloading contents of the site to find the image name")
    opener = urllib.request.build_opener()
    req = urllib.request.Request(url)
    try:
        response = opener.open(req)
        logger.info("Response", response.read())
        reply = response.read().decode()
    except urllib.error.HTTPError as error:
        logger.error(f"Error downloading {url} - {error.code}")
        reply = f"Error: {error.code})"
    return reply


# Finds the image URL and saves it
def get_image(text):
    logger.info("Grabbing the image URL")
    file_url, filename, file_size = get_image_info('a href', text)
    # If file_url is None, the today's picture might be a video
    if file_url is None:
        return None

    logger.info(f"Found name of image: {filename}")

    save_to = os.path.join(TEMPORARY_DOWNLOAD_PATH, os.path.splitext(filename)[0] + '.png')

    if not os.path.isfile(save_to):
        # If the response body is less than 500 bytes, something went wrong
        if file_size < 500:
            logger.warning("Response less than 500 bytes, probably an error\nAttempting to just grab image source")
            file_url, filename, file_size = get_image_info('img src', text)
            # If file_url is None, the today's picture might be a video
            if file_url is None:
                return None
            logger.info(f"Found name of image: {filename}")
            if file_size < 500:
                # Give up
                logger.error("Could not find image to download")
                exit()

            logger.info("Retrieving image")
            urllib.request.urlretrieve(file_url, save_to, print_download_status)

            # Adding additional padding to ensure entire line 
            logger.info(f"\rDone downloading {human_readable_size(file_size)}       ")
        else:
            urllib.request.urlretrieve(file_url, save_to)
    else:
        logger.info("File exists, moving on")

    return save_to


def get_image_info(element, source):
    # Grabs information about the image
    soup = BeautifulSoup(str(source), 'lxml')
    tags = soup.find_all('a')
    file_url = str()

    for tag in tags:
        if tag.find("img"):
            file_url = urllib.parse.urljoin(NASA_APOD_SITE, tag.get('href'))
        else:
            logger.warning("Could not find an image. May be a video today.")
            return None, None, None

    # Create our handle for our remote file
    logger.info("Opening remote URL")

    remote_file = urllib.request.urlopen(file_url)

    filename = os.path.basename(file_url)
    file_size = float(remote_file.headers.get("content-length"))

    return file_url, filename, file_size


# Resizes the image to the provided dimensions
def resize_image(filename):
    logger.info("Opening local image")

    image = Image.open(filename)
    current_x, current_y = image.size
    if (current_x, current_y) == (DEFAULT_RESOLUTION_X, DEFAULT_RESOLUTION_Y):
        logger.info("Images are currently equal in size. No need to scale.")
    else:
        logger.info("Resizing the image from", image.size[0], "x", image.size[1], "to", DEFAULT_RESOLUTION_X, "x", DEFAULT_RESOLUTION_Y)
        image = image.resize((DEFAULT_RESOLUTION_X, DEFAULT_RESOLUTION_Y), Image.ANTIALIAS)

        logger.info(f"Saving the image as {filename}")

        with open(filename, 'wb'):
            image.save(filename, 'PNG')
        #file_handler.close()


# Sets the new image as the wallpaper
def set_gnome_wallpaper(file_path):
    logger.info("Setting the wallpaper")
    command = "gsettings set org.gnome.desktop.background picture-uri file://" + file_path
    status, output = subprocess.getstatusoutput(command)  # TODO: use something like subprocess.run instead of subprocess.getstatusoutput
    return status


def print_download_status(block_count, block_size, total_size):
    written_size = human_readable_size(block_count * block_size)
    total_size = human_readable_size(total_size)

    # Adding space padding at the end to ensure we overwrite the whole line
    stdout.write(f"\r{written_size} bytes of {total_size}         ")
    stdout.flush()


def human_readable_size(number_bytes):  # TODO: returns None for sizes of 1073741824 bytes or more.
    for x in ['bytes', 'KB', 'MB']:
        if number_bytes < 1024.0:
            return "%3.2f%s" % (number_bytes, x)
        number_bytes /= 1024.0


# Creates the necessary XML so background images will scroll through
def create_desktop_background_scroll(filename):
    if not IMAGE_SCROLL:
        return filename

    logger.info("Creating XML file for desktop background switching.")

    filename = os.path.join(TEMPORARY_DOWNLOAD_PATH, '/nasa_apod_desktop_backgrounds.xml')

    # Create our base, background element
    background = etree.Element("background")

    # Grab our PNGs we have downloaded
    images = glob.glob(TEMPORARY_DOWNLOAD_PATH + "/*.png")
    num_images = len(images)

    if num_images < SEED_IMAGES:
        # Let's seed some images
        # Start with yesterday and continue going back until we have enough
        logger.info("Downloading some seed images as well")
        days_back = 0
        seed_images_left = SEED_IMAGES
        while seed_images_left > 0:
            days_back += 1
            logger.info(f"Downloading seed image ({seed_images_left} left):")
            day_to_try = datetime.now() - timedelta(days=days_back)

            # Filenames look like /apYYMMDD.html
            seed_filename = os.path.join(NASA_APOD_SITE, "ap" + day_to_try.strftime("%y%m%d") + ".html")
            seed_site_contents = download_site(seed_filename)

            # Make sure we didn't encounter an error for some reason
            if seed_site_contents == "error":
                logger.error("Seed site contains an error")
                continue

            seed_filename = get_image(seed_site_contents)
            # If the content was an video or some other error occurred, skip the
            # rest.
            if seed_filename is None:
                continue

            resize_image(seed_filename)

            # Add this to our list of images
            images.append(seed_filename)
            seed_images_left -= 1
        logger.info("Done downloading seed images")

    # Get our images in a random order so we get a new order every time we get a new file
    random.shuffle(images)
    # Recalculate the number of pictures
    num_images = len(images)

    for i, image in enumerate(images):
        # Create a static entry for keeping this image here for IMAGE_DURATION
        static = etree.SubElement(background, "static")

        # Length of time the background stays
        duration = etree.SubElement(static, "duration")
        duration.text = str(IMAGE_DURATION)

        # Assign the name of the file for our static entry
        static_file = etree.SubElement(static, "file")
        static_file.text = images[i]

        # Create a transition for the animation with a from and to
        transition = etree.SubElement(background, "transition")

        # Length of time for the switch animation
        transition_duration = etree.SubElement(transition, "duration")
        transition_duration.text = "5"

        # We are always transitioning from the current file
        transition_from = etree.SubElement(transition, "from")
        transition_from.text = images[i]

        # Create our tranition to element
        transition_to = etree.SubElement(transition, "to")

        # Check to see if we're at the end, if we are use the first image as the image to
        if i + 1 == num_images:
            transition_to.text = images[0]
        else:
            transition_to.text = images[i + 1]

    xml_tree = etree.ElementTree(background)
    xml_tree.write(filename, pretty_print=True)

    return filename


if __name__ == '__main__':
    logger.info("Starting")

    # Find desktop resolution
    DEFAULT_RESOLUTION_X, DEFAULT_RESOLUTION_Y = find_display_resolution()

    # Set a localized download folder
    TEMPORARY_DOWNLOAD_PATH = get_user_download_directory()

    # Create the download path if it doesn't exist
    if not os.path.exists(os.path.expanduser(TEMPORARY_DOWNLOAD_PATH)):
        os.makedirs(os.path.expanduser(TEMPORARY_DOWNLOAD_PATH))

    # Grab the HTML contents of the file
    site_contents = download_site(NASA_APOD_SITE)
    if site_contents == "error":
        logger.error("Could not contact site.")
        exit()

    # Download the image
    filename = get_image(site_contents)
    if filename is not None:
        # Resize the image
        resize_image(filename)

    # Create the desktop switching xml
    filename = create_desktop_background_scroll(filename)
    # If the script was unable todays image and IMAGE_SCROLL is set to False,
    # the script exits
    if filename is None:
        logger.error("Today's image could not be downloaded.")
        exit()

    # Set the wallpaper
    status = set_gnome_wallpaper(filename)
    logger.info("Finished!")
Regards,
Atalanttore
__blackjack__
User
Posts: 3524
Joined: Saturday, 2 June 2018, 10:21

Sunday, 7 July 2019, 21:55

@Atalanttore: First come two `TypeError`s from logging calls that are passed additional arguments for which the first argument contains no placeholders. Either format the values into the first argument before the logging call, as you do elsewhere, or supply a placeholder.
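
A minimal sketch of the placeholder variant. The logger name `apod_example` and the in-memory stream are assumptions made only so that the output can be inspected; the message and the path are taken from the traceback.

```python
import io
import logging

# Log into a string buffer instead of stderr so the result can be checked.
stream = io.StringIO()
logger = logging.getLogger("apod_example")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(logging.StreamHandler(stream))

new_path = "/home/ata/Downloads/nasa-apod-backgrounds"

# The extra argument is merged into the message via the %s placeholder.
# Without a placeholder, logging reports "not all arguments converted".
logger.info("Using automatically detected path: %s", new_path)

logged = stream.getvalue()
print(logged, end="")
```

Unlike `print()`, the logging methods do not join their arguments with spaces; everything after the first argument is applied to it with the `%` operator.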

`get_image_info()` contains odd code. `file_url` is initialised with an empty string. That has the right data type, but it can never be a meaningful, valid value for a URL. Why is it done that way? The value is then actually used later when there is no <a> element in the HTML. That case should be handled more sensibly. I'm almost certain that this is exactly the case the code runs into here, and which raises the exception.
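
One possible reason why no <a> element is found at all, judging from `download_site()` in the listing: `response.read()` is called twice, once for the log message and once for the return value. A response body can only be read once, so the second call yields nothing. The sketch below uses `io.BytesIO` as a stand-in for the HTTP response and a shortened, made-up HTML snippet; both are assumptions for illustration.

```python
import io

html = b'<a href="image/1907/CrescentSaturn_cassini_4824.jpg"><img src="small.jpg"></a>'

# Pattern from the listing: the first read() consumes the whole body.
response = io.BytesIO(html)  # stand-in for the object returned by urlopen()
logged_body = response.read()
reply = response.read().decode()  # second read() returns b''

# Reading once and reusing the result keeps the content.
response = io.BytesIO(html)
body = response.read()
reply_fixed = body.decode()

print(repr(reply))
print(reply_fixed)
```

With an empty string as input, the parser finds no tags, and `file_url` keeps its initial empty value.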

In the loop over the <a> tags, the function is left and (None, None, None) is returned as soon as there is even *one* <a> tag in the HTML that contains no <img> element. That looks quite wrong. And if all <a> elements do contain an <img> element, the URL of the last <a> tag found is used.
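
A sketch of the intended control flow: skip links without an image, return on the first link that wraps one, and report "nothing found" only after all links have been checked. To keep it runnable without third-party packages it uses `html.parser` from the standard library instead of BeautifulSoup; the names `FirstImageLinkParser` and `find_image_url` are made up for this example. With BeautifulSoup, the same flow would be a loop over `find_all('a')` that returns at the first tag for which `tag.find('img')` is truthy.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

NASA_APOD_SITE = 'http://apod.nasa.gov/apod/'


class FirstImageLinkParser(HTMLParser):
    """Remember the href of the first <a> element that contains an <img>."""

    def __init__(self):
        super().__init__()
        self._open_hrefs = []   # hrefs of the <a> elements currently open
        self.image_href = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case.
        if tag == 'a':
            self._open_hrefs.append(dict(attrs).get('href'))
        elif tag == 'img' and self._open_hrefs and self.image_href is None:
            self.image_href = self._open_hrefs[-1]

    def handle_endtag(self, tag):
        if tag == 'a' and self._open_hrefs:
            self._open_hrefs.pop()


def find_image_url(html, base_url):
    """Return the absolute URL of the first linked image, or None."""
    parser = FirstImageLinkParser()
    parser.feed(html)
    if parser.image_href is None:
        return None
    return urljoin(base_url, parser.image_href)


sample_html = '''
<a href="archivepix.html">Discover the cosmos!</a>
<a href="image/1907/CrescentSaturn_cassini_4824.jpg">
<IMG SRC="image/1907/CrescentSaturn_cassini_1080.jpg"></a>
<a href="ap190706.html">&lt;</a>
'''

print(find_image_url(sample_html, NASA_APOD_SITE))
print(find_image_url('<a href="ap190706.html">no image</a>', NASA_APOD_SITE))
```

Returning a single `None` instead of an empty string makes the "no image" case impossible to confuse with a URL.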

Then something from `os.path` is again used with a URL. That is not guaranteed to work in the first place, because paths are something different from URLs, and even on systems where paths and URLs look alike it falls flat when the URL also has a “query” and/or “fragment” part.
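
A sketch of one way to get the file name from a URL: split the URL first and only look at its path component. The helper name `filename_from_url` and the query and fragment in the second URL are made up for illustration. `posixpath` is used because URL paths always use forward slashes, regardless of the operating system.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def filename_from_url(url):
    """Return the last path segment of a URL, ignoring query and fragment."""
    path = urlsplit(url).path
    return posixpath.basename(unquote(path))


plain = 'http://apod.nasa.gov/apod/image/1907/CrescentSaturn_cassini_4824.jpg'
decorated = plain + '?size=large#top'  # hypothetical query and fragment

# A path function keeps query and fragment as part of the "file name".
naive_name = posixpath.basename(decorated)

print(naive_name)
print(filename_from_url(decorated))
```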
Atalanttore
User
Posts: 338
Joined: Friday, 6 August 2010, 17:03

Monday, 8 July 2019, 20:14

@__blackjack__: So the `logger` does not support joining strings with commas (as the `print()` function does).

Is the code in `get_image_info()` less odd now, even though it still doesn't work as intended?

Current code:

Code:

from gi.repository import GLib
from bs4 import BeautifulSoup

import logging
import subprocess
import urllib.request, urllib.parse, urllib.error
import re
import os
import random
import glob
from PIL import Image
from sys import stdout
from sys import exit
from lxml import etree
from datetime import datetime, timedelta

NASA_APOD_SITE = 'http://apod.nasa.gov/apod/'
TEMPORARY_DOWNLOAD_PATH = '/tmp/backgrounds/'
CUSTOM_FOLDER = 'nasa-apod-backgrounds'

RESOLUTION_TYPE = 'stretch'
DEFAULT_RESOLUTION_X = 1024
DEFAULT_RESOLUTION_Y = 768

IMAGE_SCROLL = True
IMAGE_DURATION = 1200
SEED_IMAGES = 10
SHOW_DEBUG = False

LOG_LEVEL = logging.DEBUG
LOG_FORMAT = '%(asctime)s %(name)s: %(message)s'

logger = logging.getLogger(__name__)
logger.setLevel(LOG_LEVEL)

formatter = logging.Formatter(LOG_FORMAT)

stream_handler = logging.StreamHandler()
stream_handler.setFormatter(formatter)

logger.addHandler(stream_handler)


# Use XRandR to grab the desktop resolution. If the scaling method is set to 'largest',
# we will attempt to grab it from the largest connected device. If the scaling method
# is set to 'stretch' we will grab it from the current value. Default will simply use
# what was set for the default resolutions.
def find_display_resolution():
    if RESOLUTION_TYPE == 'default':
        logger.info(f"Using default resolution of {DEFAULT_RESOLUTION_X}x{DEFAULT_RESOLUTION_Y}")
        return DEFAULT_RESOLUTION_X, DEFAULT_RESOLUTION_Y

    resolution_x = 0
    resolution_y = 0

    logger.info("Attempting to determine the current resolution.")
    if RESOLUTION_TYPE == 'largest':
        regex_search = 'connected'
    else:
        regex_search = 'current'

    p1 = subprocess.Popen(["xrandr"], stdout=subprocess.PIPE)
    p2 = subprocess.Popen(["grep", regex_search], stdin=p1.stdout, stdout=subprocess.PIPE)  # TODO: use Python's re module

    p3 = re.findall(regex_search, str(p1.communicate()[0]))
    p1.stdout.close()
    output = str(p2.communicate()[0])

    if RESOLUTION_TYPE == 'largest':
        # We are going to go through the connected devices and get the X/Y from the largest
        matches = re.finditer(" connected ([0-9]+)x([0-9]+)+", output)  # TODO: returns an iterator, which is always "truthy".
        if matches:
            largest = 0
            for match in matches:
                if int(match.group(1)) * int(match.group(2)) > largest:
                    resolution_x = match.group(1)
                    resolution_y = match.group(2)
        else:
            logger.warning("Could not determine largest screen resolution.")

    else:
        reg = re.search(".* current (.*?) x (.*?),.*", output)
        if reg:
            resolution_x = reg.group(1)
            resolution_y = reg.group(2)
        else:
            logger.warning("Could not determine current screen resolution.")

    # If we couldn't find anything automatically use what was set for the defaults
    if resolution_x == 0 or resolution_y == 0:
        resolution_x = DEFAULT_RESOLUTION_X
        resolution_y = DEFAULT_RESOLUTION_Y
        logger.warning("Could not determine resolution automatically. Using defaults.")

    logger.info(f"Using detected resolution of {resolution_x}x{resolution_y}")

    return int(resolution_x), int(resolution_y)


# Uses GLib to find the localized "Downloads" folder
# See: http://askubuntu.com/questions/137896/how-to-get-the-user-downloads-folder-location-with-python
def get_user_download_directory():
    downloads_dir = GLib.get_user_special_dir(GLib.USER_DIRECTORY_DOWNLOAD)

    if downloads_dir:
        # Add any custom folder
        new_path = os.path.join(downloads_dir, CUSTOM_FOLDER)
        logger.info(f"Using automatically detected path: {new_path}")
    else:
        new_path = TEMPORARY_DOWNLOAD_PATH
        logger.warning("Could not determine download folder with GLib. Using default.")
    return new_path


# Download HTML of the site
def download_site(url):
    logger.info("Downloading contents of the site to find the image name")
    opener = urllib.request.build_opener()
    req = urllib.request.Request(url)
    try:
        response = opener.open(req)
        logger.info(f"Response: {response.read()}")
        reply = response.read().decode()
    except urllib.error.HTTPError as error:
        logger.error(f"Error downloading {url} - {error.code}")
        reply = f"Error: {error.code})"
    return reply


# Finds the image URL and saves it
def get_image(text):
    logger.info("Grabbing the image URL")
    file_url, filename, file_size = get_image_info('a href', text)
    # If file_url is None, the today's picture might be a video
    if file_url is None:
        return None

    logger.info(f"Found name of image: {filename}")

    save_to = os.path.join(TEMPORARY_DOWNLOAD_PATH, os.path.splitext(filename)[0] + '.png')

    if not os.path.isfile(save_to):
        # If the response body is less than 500 bytes, something went wrong
        if file_size < 500:
            logger.warning("Response less than 500 bytes, probably an error\nAttempting to just grab image source")
            file_url, filename, file_size = get_image_info('img src', text)
            # If file_url is None, the today's picture might be a video
            if file_url is None:
                return None
            logger.info(f"Found name of image: {filename}")
            if file_size < 500:
                # Give up
                logger.error("Could not find image to download")
                exit()

            logger.info("Retrieving image")
            urllib.request.urlretrieve(file_url, save_to, print_download_status)

            # Adding additional padding to ensure entire line 
            logger.info(f"\rDone downloading {human_readable_size(file_size)}       ")
        else:
            urllib.request.urlretrieve(file_url, save_to)
    else:
        logger.info("File exists, moving on")

    return save_to


def get_image_info(element, source):
    # Grabs information about the image
    soup = BeautifulSoup(str(source), 'lxml')
    tags = soup.find_all('a')

    if tags:
        for tag in tags:
            if tag.find("img"):
                file_url = urllib.parse.urljoin(NASA_APOD_SITE, tag.get('href'))

                # Create our handle for our remote file
                logger.info("Opening remote URL")

                remote_file = urllib.request.urlopen(file_url)

                filename = os.path.basename(file_url)  # TODO: not guaranteed to work, because paths are something different from URLs

                file_size = float(remote_file.headers.get("content-length"))

                return file_url, filename, file_size

    else:
        logger.warning("Could not find an image. May be a video today.")
        return None, None, None


# Resizes the image to the provided dimensions
def resize_image(filename):
    logger.info("Opening local image")

    image = Image.open(filename)
    current_x, current_y = image.size
    if (current_x, current_y) == (DEFAULT_RESOLUTION_X, DEFAULT_RESOLUTION_Y):
        logger.info("Images are currently equal in size. No need to scale.")
    else:
        logger.info("Resizing the image from", image.size[0], "x", image.size[1], "to", DEFAULT_RESOLUTION_X, "x", DEFAULT_RESOLUTION_Y)
        image = image.resize((DEFAULT_RESOLUTION_X, DEFAULT_RESOLUTION_Y), Image.ANTIALIAS)

        logger.info(f"Saving the image as {filename}")

        with open(filename, 'wb'):
            image.save(filename, 'PNG')
        #file_handler.close()


# Sets the new image as the wallpaper
def set_gnome_wallpaper(file_path):
    logger.info("Setting the wallpaper")
    command = "gsettings set org.gnome.desktop.background picture-uri file://" + file_path
    status, output = subprocess.getstatusoutput(command)  # TODO: use something like subprocess.run instead of subprocess.getstatusoutput
    return status


def print_download_status(block_count, block_size, total_size):
    written_size = human_readable_size(block_count * block_size)
    total_size = human_readable_size(total_size)

    # Adding space padding at the end to ensure we overwrite the whole line
    stdout.write(f"\r{written_size} bytes of {total_size}         ")
    stdout.flush()


def human_readable_size(number_bytes):  # TODO: returns None for sizes of 1073741824 bytes or more.
    for x in ['bytes', 'KB', 'MB']:
        if number_bytes < 1024.0:
            return "%3.2f%s" % (number_bytes, x)
        number_bytes /= 1024.0


# Creates the necessary XML so background images will scroll through
def create_desktop_background_scroll(filename):
    if not IMAGE_SCROLL:
        return filename

    logger.info("Creating XML file for desktop background switching.")

    filename = os.path.join(TEMPORARY_DOWNLOAD_PATH, '/nasa_apod_desktop_backgrounds.xml')

    # Create our base, background element
    background = etree.Element("background")

    # Grab our PNGs we have downloaded
    images = glob.glob(TEMPORARY_DOWNLOAD_PATH + "/*.png")
    num_images = len(images)

    if num_images < SEED_IMAGES:
        # Let's seed some images
        # Start with yesterday and continue going back until we have enough
        logger.info("Downloading some seed images as well")
        days_back = 0
        seed_images_left = SEED_IMAGES
        while seed_images_left > 0:
            days_back += 1
            logger.info(f"Downloading seed image ({seed_images_left} left):")
            day_to_try = datetime.now() - timedelta(days=days_back)

            # Filenames look like /apYYMMDD.html
            seed_filename = os.path.join(NASA_APOD_SITE, "ap" + day_to_try.strftime("%y%m%d") + ".html")
            seed_site_contents = download_site(seed_filename)

            # Make sure we didn't encounter an error for some reason
            if seed_site_contents == "error":
                logger.error("Seed site contains an error")
                continue

            seed_filename = get_image(seed_site_contents)
            # If the content was an video or some other error occurred, skip the
            # rest.
            if seed_filename is None:
                continue

            resize_image(seed_filename)

            # Add this to our list of images
            images.append(seed_filename)
            seed_images_left -= 1
        logger.info("Done downloading seed images")

    # Get our images in a random order so we get a new order every time we get a new file
    random.shuffle(images)
    # Recalculate the number of pictures
    num_images = len(images)

    for i, image in enumerate(images):
        # Create a static entry for keeping this image here for IMAGE_DURATION
        static = etree.SubElement(background, "static")

        # Length of time the background stays
        duration = etree.SubElement(static, "duration")
        duration.text = str(IMAGE_DURATION)

        # Assign the name of the file for our static entry
        static_file = etree.SubElement(static, "file")
        static_file.text = images[i]

        # Create a transition for the animation with a from and to
        transition = etree.SubElement(background, "transition")

        # Length of time for the switch animation
        transition_duration = etree.SubElement(transition, "duration")
        transition_duration.text = "5"

        # We are always transitioning from the current file
        transition_from = etree.SubElement(transition, "from")
        transition_from.text = images[i]

        # Create our tranition to element
        transition_to = etree.SubElement(transition, "to")

        # Check to see if we're at the end, if we are use the first image as the image to
        if i + 1 == num_images:
            transition_to.text = images[0]
        else:
            transition_to.text = images[i + 1]

    xml_tree = etree.ElementTree(background)
    xml_tree.write(filename, pretty_print=True)

    return filename


if __name__ == '__main__':
    logger.info("Starting")

    # Find desktop resolution
    DEFAULT_RESOLUTION_X, DEFAULT_RESOLUTION_Y = find_display_resolution()

    # Set a localized download folder
    TEMPORARY_DOWNLOAD_PATH = get_user_download_directory()

    # Create the download path if it doesn't exist
    if not os.path.exists(os.path.expanduser(TEMPORARY_DOWNLOAD_PATH)):
        os.makedirs(os.path.expanduser(TEMPORARY_DOWNLOAD_PATH))

    # Grab the HTML contents of the file
    site_contents = download_site(NASA_APOD_SITE)
    if site_contents == "error":
        logger.error("Could not contact site.")
        exit()

    # Download the image
    filename = get_image(site_contents)
    if filename is not None:
        # Resize the image
        resize_image(filename)

    # Create the desktop switching xml
    filename = create_desktop_background_scroll(filename)
    # If the script was unable todays image and IMAGE_SCROLL is set to False,
    # the script exits
    if filename is None:
        logger.error("Today's image could not be downloaded.")
        exit()

    # Set the wallpaper
    status = set_gnome_wallpaper(filename)
    logger.info("Finished!")
Regards
Atalanttore
__blackjack__
User
Posts: 3524
Registered: Saturday, 2 June 2018, 10:21

Monday, 8 July 2019, 22:31

@Atalanttore: “Concatenating with a comma” sounds as if the comma meant something special there. It means the same thing in `print()` as in the `Logger` methods and in every other function or method call: it separates arguments. `print()` does not concatenate its arguments either; it simply writes them out one after another, separated by a space or by whatever was passed as the `sep` keyword argument.

The logger methods also accept any number of positional arguments and format them into the first argument with the `%` operator, but only *if* the message is actually going to be output. This is meant for cases where converting an argument to a string is relatively “expensive”, so that the conversion only has to happen when the message is logged at all.
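A minimal sketch of that behaviour, assuming a throwaway logger named `apod_example` that writes into an in-memory stream (both are illustrative choices, not part of the script). It also reproduces the `TypeError` from the traceback: extra arguments with no placeholders in the first argument cannot be merged in.

```python
import io
import logging

# Hypothetical logger that writes into a string buffer instead of stderr.
stream = io.StringIO()
logger = logging.getLogger("apod_example")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(logging.StreamHandler(stream))

width, height = 3840, 1080

# The extra arguments are merged into the first one with the % operator,
# and only when the record is actually emitted.
logger.info("Resizing the image to %d x %d", width, height)

# Below the configured level: this message is never formatted or written.
logger.debug("Resizing the image to %d x %d", width, height)

output = stream.getvalue()

# What the logging module effectively attempts when the first argument
# contains no placeholders but further arguments are passed:
try:
    "Resizing the image from" % (width, "x", height)
except TypeError as error:
    reason = str(error)
```

With an f-string the message is always built, even if it is then discarded; with placeholders the formatting is deferred to the logging module.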

`get_image_info()` looks more sensible now.

Edit: This time I noticed `download_site()`, which returns a special error value in the failure case, and that value has the same type as a valid result. Specifically, the error value is the string f"Error: {error.code})". Where the function is called, however, the result is compared for equality with 'error'. Inside the function an exception is replaced by a special error value that every caller then has to check explicitly. Exceptions were invented precisely to get rid of this kind of fragile error handling.
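One way this could look, as a sketch rather than the script's actual code: `download_site()` simply lets the exception from `urllib` propagate, and one caller decides what to do about it. The `open_url` parameter, `main()`, and the two `fake_*` helpers are assumptions added here so the example runs without network access.

```python
import io
import urllib.error
import urllib.request


def download_site(url, open_url=urllib.request.urlopen):
    """Return the decoded page body; HTTP errors propagate to the caller."""
    with open_url(url) as response:
        return response.read().decode()


def main(url, open_url=urllib.request.urlopen):
    """Example caller: handle the failure once and report an exit status."""
    try:
        html = download_site(url, open_url)
    except urllib.error.URLError as error:  # HTTPError is a subclass
        print(f"Could not download {url}: {error}")
        return 1
    print(f"Downloaded {len(html)} characters")
    return 0


# Offline stand-ins for urlopen(), purely for demonstration.
def fake_ok(url):
    return io.BytesIO(b"<html>ok</html>")


def fake_not_found(url):
    raise urllib.error.HTTPError(url, 404, "Not Found", None, None)
```

A caller that forgets the `try` no longer continues silently with a bogus string; the program stops with a traceback that names the real problem.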
Sirius3
User
Posts: 9999
Registered: Sunday, 21 October 2012, 17:20

Tuesday, 9 July 2019, 07:04

`get_image_info` still looks wrong. Just as there is `urljoin`, there are also `urlsplit` and `urlparse` for taking a URL apart again. A `return` nested deep inside a for loop is hard to read.
The real problem, though, is that if no a tag containing an img tag is found, `None` is returned instead of `(None, None, None)`. That is bad because it is unexpected, and you don't check for that case at the call site either.
Why is `file_size` a float? Do you want to be able to handle half bytes as well?
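As a sketch of the `urlsplit` suggestion: the path component of the URL is isolated first, and only then is the last segment taken with `posixpath`, since URL paths always use forward slashes. The helper name `filename_from_url` and the example image URL are made up for illustration.

```python
import posixpath
from urllib.parse import unquote, urljoin, urlsplit


def filename_from_url(url):
    """Return the last path segment of a URL, ignoring query and fragment."""
    path = urlsplit(url).path
    return posixpath.basename(unquote(path))


# urljoin() builds URLs; os.path.join() is meant for file system paths.
page_url = urljoin("http://apod.nasa.gov/apod/", "ap190707.html")

# Hypothetical image URL with a query string and a fragment attached.
name = filename_from_url(
    "http://apod.nasa.gov/apod/image/1907/example.jpg?size=large#top"
)
```

No regular expression is needed; `urlsplit` already separates scheme, host, path, query, and fragment.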
Atalanttore
User
Posts: 338
Registered: Friday, 6 August 2010, 17:03

Tuesday, 9 July 2019, 20:42

@__blackjack__: Thanks for the explanations. Should the function `download_site()` raise an exception (perhaps a `ConnectionError`) when no page could be downloaded?


@Sirius3: Thanks for the explanations. I have adjusted the function `get_image_info()` further.
I have not yet found any example code for `urlsplit` or `urlparse` that shows how to get at the file name simply and without nested regular expressions. How would you do it?


Current version of the function `get_image_info()` [the rest of the code has not changed]:

Code:

def get_image_info(element, source):
    # Grabs information about the image
    file_url = None
    file_name = None
    file_size = None

    soup = BeautifulSoup(str(source), 'lxml')
    tags = soup.find_all('a')
    print("Tags:", tags)  # List is always empty :(

    if tags:
        for tag in tags:
            if tag.find("img"):
                file_url = urllib.parse.urljoin(NASA_APOD_SITE, tag.get('href'))

                # Create our handle for our remote file
                logger.info("Opening remote URL")

                remote_file = urllib.request.urlopen(file_url)

                file_name = os.path.basename(file_url)  # TODO: Not guaranteed to work in the first place, because paths are something different from URLs

                file_size = int(remote_file.headers.get("content-length"))

    else:
        logger.warning("Could not find an image. May be a video today.")

    return file_url, file_name, file_size
Regards
Atalanttore
Sirius3
User
Posts: 9999
Registered: Sunday, 21 October 2012, 17:20

Wednesday, 10 July 2019, 07:02

@Atalanttore: now you still have the problem that no warning is emitted in the case where no a tag contains an img tag.
The parameter `element` is not used at all.

In other places, too, you use return values (`None`) where it would be better to use exceptions. `exit` should not appear at all in a clean program, because callers cannot do any error handling for functions that call it.
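A rough sketch of how the "no linked image" case could raise an exception instead of returning `None`. To stay runnable without third-party packages it swaps BeautifulSoup for the standard library's `html.parser`; `ImageNotFoundError`, `LinkedImageFinder`, and `find_image_url` are hypothetical names, and the HTML snippets in the usage note are invented.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageNotFoundError(Exception):
    """Raised when no <a> element wraps an <img> element."""


class LinkedImageFinder(HTMLParser):
    """Remember the href of the first <a> that contains an <img>."""

    def __init__(self):
        super().__init__()
        self.open_hrefs = []  # href values of the currently open <a> tags
        self.href = None      # result, stays None if nothing is found

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.open_hrefs.append(dict(attrs).get("href"))
        elif tag == "img" and self.open_hrefs and self.href is None:
            self.href = self.open_hrefs[-1]

    def handle_endtag(self, tag):
        if tag == "a" and self.open_hrefs:
            self.open_hrefs.pop()


def find_image_url(html, base_url):
    """Return the absolute URL of the linked image or raise an exception."""
    finder = LinkedImageFinder()
    finder.feed(html)
    if finder.href is None:
        raise ImageNotFoundError("No linked image found. May be a video today.")
    return urljoin(base_url, finder.href)
```

For a page containing `<a href="image/1907/x.jpg"><IMG SRC="image/1907/x_small.jpg"></a>` and the base URL `http://apod.nasa.gov/apod/`, this yields the joined image URL; for a page with only text links, the caller gets an `ImageNotFoundError` it can catch in one place, so neither a forgotten warning nor an `exit()` deep inside a helper is needed.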
Atalanttore
User
Posts: 338
Registered: Friday, 6 August 2010, 17:03

Saturday, 13 July 2019, 15:28

@Sirius3: Thanks for the suggestions.

After commenting out a logger call that invokes `response.read()` in the function `download_site()`, the function now also returns the downloaded HTML source. Why is that?

Current code:

Code:

from gi.repository import GLib
from bs4 import BeautifulSoup

import logging
import subprocess
import urllib.request, urllib.parse, urllib.error
import re
import os
import random
import glob
from PIL import Image
from sys import stdout
from sys import exit
from lxml import etree
from datetime import datetime, timedelta

NASA_APOD_SITE = 'http://apod.nasa.gov/apod/'
TEMPORARY_DOWNLOAD_PATH = '/tmp/backgrounds/'
CUSTOM_FOLDER = 'nasa-apod-backgrounds'

RESOLUTION_TYPE = 'stretch'
DEFAULT_RESOLUTION_X = 1024
DEFAULT_RESOLUTION_Y = 768

IMAGE_SCROLL = True
IMAGE_DURATION = 1200
SEED_IMAGES = 10
SHOW_DEBUG = False

LOG_LEVEL = logging.DEBUG
LOG_FORMAT = '%(asctime)s %(name)s: %(message)s'

logger = logging.getLogger(__name__)
logger.setLevel(LOG_LEVEL)

formatter = logging.Formatter(LOG_FORMAT)

stream_handler = logging.StreamHandler()
stream_handler.setFormatter(formatter)

logger.addHandler(stream_handler)


# Use XRandR to grab the desktop resolution. If the scaling method is set to 'largest',
# we will attempt to grab it from the largest connected device. If the scaling method
# is set to 'stretch' we will grab it from the current value. Default will simply use
# what was set for the default resolutions.
def find_display_resolution():
    if RESOLUTION_TYPE == 'default':
        logger.info(f"Using default resolution of {DEFAULT_RESOLUTION_X}x{DEFAULT_RESOLUTION_Y}")
        return DEFAULT_RESOLUTION_X, DEFAULT_RESOLUTION_Y

    resolution_x = 0
    resolution_y = 0

    logger.info("Attempting to determine the current resolution.")
    if RESOLUTION_TYPE == 'largest':
        regex_search = 'connected'
    else:
        regex_search = 'current'

    p1 = subprocess.Popen(["xrandr"], stdout=subprocess.PIPE)
    p2 = subprocess.Popen(["grep", regex_search], stdin=p1.stdout, stdout=subprocess.PIPE)  # TODO: Use Python's re module

    p3 = re.findall(regex_search, str(p1.communicate()[0]))
    p1.stdout.close()
    output = str(p2.communicate()[0])

    if RESOLUTION_TYPE == 'largest':
        # We are going to go through the connected devices and get the X/Y from the largest
        matches = re.finditer(" connected ([0-9]+)x([0-9]+)+", output)  # TODO: returns an iterator, which is always "truthy".
        if matches:
            largest = 0
            for match in matches:
                if int(match.group(1)) * int(match.group(2)) > largest:
                    resolution_x = match.group(1)
                    resolution_y = match.group(2)
        else:
            logger.warning("Could not determine largest screen resolution.")

    else:
        reg = re.search(".* current (.*?) x (.*?),.*", output)
        if reg:
            resolution_x = reg.group(1)
            resolution_y = reg.group(2)
        else:
            logger.warning("Could not determine current screen resolution.")

    # If we couldn't find anything automatically use what was set for the defaults
    if resolution_x == 0 or resolution_y == 0:
        resolution_x = DEFAULT_RESOLUTION_X
        resolution_y = DEFAULT_RESOLUTION_Y
        logger.warning("Could not determine resolution automatically. Using defaults.")

    logger.info(f"Using detected resolution of {resolution_x}x{resolution_y}")

    return int(resolution_x), int(resolution_y)


# Uses GLib to find the localized "Downloads" folder
# See: http://askubuntu.com/questions/137896/how-to-get-the-user-downloads-folder-location-with-python
def get_user_download_directory():
    downloads_dir = GLib.get_user_special_dir(GLib.USER_DIRECTORY_DOWNLOAD)

    if downloads_dir:
        # Add any custom folder
        new_path = os.path.join(downloads_dir, CUSTOM_FOLDER)
        logger.info(f"Using automatically detected path: {new_path}")
    else:
        new_path = TEMPORARY_DOWNLOAD_PATH
        logger.warning("Could not determine download folder with GLib. Using default.")
    return new_path


# Download HTML of the site
def download_site(url):
    logger.info("Downloading contents of the site to find the image name")
    opener = urllib.request.build_opener()
    req = urllib.request.Request(url)
    try:
        response = opener.open(req)
        #logger.info(f"Response: {response.read()}")
        reply = response.read().decode()
    except urllib.error.HTTPError as error:
        logger.error(f"Error downloading {url} - {error.code}")
        reply = "error"
    return reply


# Finds the image URL and saves it
def get_image(text):
    logger.info("Grabbing the image URL")
    file_url, filename, file_size = get_image_info(text)
    # If file_url is None, the today's picture might be a video
    if file_url is None:
        return None

    logger.info(f"Found name of image: {filename}")

    save_to = os.path.join(TEMPORARY_DOWNLOAD_PATH, os.path.splitext(filename)[0] + '.png')

    if not os.path.isfile(save_to):
        # If the response body is less than 500 bytes, something went wrong
        if file_size < 500:
            logger.warning("Response less than 500 bytes, probably an error\nAttempting to just grab image source")
            file_url, filename, file_size = get_image_info(text)
            # If file_url is None, the today's picture might be a video
            if file_url is None:
                return None
            logger.info(f"Found name of image: {filename}")
            if file_size < 500:
                # Give up
                logger.error("Could not find image to download")
                exit()

            logger.info("Retrieving image")
            urllib.request.urlretrieve(file_url, save_to, print_download_status)

            # Adding additional padding to ensure entire line 
            logger.info(f"\rDone downloading {human_readable_size(file_size)}       ")
        else:
            urllib.request.urlretrieve(file_url, save_to)
    else:
        logger.info("File exists, moving on")

    return save_to


def get_image_info(source):
    # Grabs information about the image
    file_url = None
    file_name = None
    file_size = None

    soup = BeautifulSoup(str(source), 'lxml')
    tags = soup.find_all('a')
    print("Tags:", tags)  # List is always empty :(

    if tags:
        for tag in tags:
            if tag.find("img"):
                file_url = urllib.parse.urljoin(NASA_APOD_SITE, tag.get('href'))

                # Create our handle for our remote file
                logger.info("Opening remote URL")

                remote_file = urllib.request.urlopen(file_url)

                file_name = os.path.basename(file_url)  # TODO: Not guaranteed to work in the first place, because paths are something different from URLs

                file_size = int(remote_file.headers.get("content-length"))

    else:
        logger.warning("Could not find an image. May be a video today.")

    return file_url, file_name, file_size


# Resizes the image to the provided dimensions
def resize_image(filename):
    logger.info("Opening local image")

    image = Image.open(filename)
    current_x, current_y = image.size
    if (current_x, current_y) == (DEFAULT_RESOLUTION_X, DEFAULT_RESOLUTION_Y):
        logger.info("Images are currently equal in size. No need to scale.")
    else:
        logger.info("Resizing the image from", image.size[0], "x", image.size[1], "to", DEFAULT_RESOLUTION_X, "x", DEFAULT_RESOLUTION_Y)
        image = image.resize((DEFAULT_RESOLUTION_X, DEFAULT_RESOLUTION_Y), Image.ANTIALIAS)

        logger.info(f"Saving the image as {filename}")

        with open(filename, 'wb'):
            image.save(filename, 'PNG')


# Sets the new image as the wallpaper
def set_gnome_wallpaper(file_path):
    logger.info("Setting the wallpaper")
    command = "gsettings set org.gnome.desktop.background picture-uri file://" + file_path
    status, output = subprocess.getstatusoutput(command)  # TODO: Use something like subprocess.run instead of subprocess.getstatusoutput
    return status


def print_download_status(block_count, block_size, total_size):
    written_size = human_readable_size(block_count * block_size)
    total_size = human_readable_size(total_size)

    # Adding space padding at the end to ensure we overwrite the whole line
    stdout.write(f"\r{written_size} bytes of {total_size}         ")
    stdout.flush()


def human_readable_size(number_bytes):  # TODO: returns None for sizes of 1073741824 bytes or more.
    for x in ['bytes', 'KB', 'MB']:
        if number_bytes < 1024.0:
            return "%3.2f%s" % (number_bytes, x)
        number_bytes /= 1024.0


# Creates the necessary XML so background images will scroll through
def create_desktop_background_scroll(filename):
    if not IMAGE_SCROLL:
        return filename

    logger.info("Creating XML file for desktop background switching.")

    filename = os.path.join(TEMPORARY_DOWNLOAD_PATH, '/nasa_apod_desktop_backgrounds.xml')

    # Create our base, background element
    background = etree.Element("background")

    # Grab our PNGs we have downloaded
    images = glob.glob(TEMPORARY_DOWNLOAD_PATH + "/*.png")
    num_images = len(images)

    if num_images < SEED_IMAGES:
        # Let's seed some images
        # Start with yesterday and continue going back until we have enough
        logger.info("Downloading some seed images as well")
        days_back = 0
        seed_images_left = SEED_IMAGES
        while seed_images_left > 0:
            days_back += 1
            logger.info(f"Downloading seed image ({seed_images_left} left):")
            day_to_try = datetime.now() - timedelta(days=days_back)

            # Filenames look like /apYYMMDD.html
            seed_filename = os.path.join(NASA_APOD_SITE, "ap" + day_to_try.strftime("%y%m%d") + ".html")
            seed_site_contents = download_site(seed_filename)

            # Make sure we didn't encounter an error for some reason
            if seed_site_contents == "error":
                logger.error("Seed site contains an error")
                continue

            seed_filename = get_image(seed_site_contents)
            # If the content was an video or some other error occurred, skip the
            # rest.
            if seed_filename is None:
                continue

            resize_image(seed_filename)

            # Add this to our list of images
            images.append(seed_filename)
            seed_images_left -= 1
        logger.info("Done downloading seed images")

    # Get our images in a random order so we get a new order every time we get a new file
    random.shuffle(images)
    # Recalculate the number of pictures
    num_images = len(images)

    for i, image in enumerate(images):
        # Create a static entry for keeping this image here for IMAGE_DURATION
        static = etree.SubElement(background, "static")

        # Length of time the background stays
        duration = etree.SubElement(static, "duration")
        duration.text = str(IMAGE_DURATION)

        # Assign the name of the file for our static entry
        static_file = etree.SubElement(static, "file")
        static_file.text = images[i]

        # Create a transition for the animation with a from and to
        transition = etree.SubElement(background, "transition")

        # Length of time for the switch animation
        transition_duration = etree.SubElement(transition, "duration")
        transition_duration.text = "5"

        # We are always transitioning from the current file
        transition_from = etree.SubElement(transition, "from")
        transition_from.text = images[i]

        # Create our tranition to element
        transition_to = etree.SubElement(transition, "to")

        # Check to see if we're at the end, if we are use the first image as the image to
        if i + 1 == num_images:
            transition_to.text = images[0]
        else:
            transition_to.text = images[i + 1]

    xml_tree = etree.ElementTree(background)
    xml_tree.write(filename, pretty_print=True)

    return filename


if __name__ == '__main__':
    logger.info("Starting")

    # Find desktop resolution
    DEFAULT_RESOLUTION_X, DEFAULT_RESOLUTION_Y = find_display_resolution()

    # Set a localized download folder
    TEMPORARY_DOWNLOAD_PATH = get_user_download_directory()

    # Create the download path if it doesn't exist
    if not os.path.exists(os.path.expanduser(TEMPORARY_DOWNLOAD_PATH)):
        os.makedirs(os.path.expanduser(TEMPORARY_DOWNLOAD_PATH))

    # Grab the HTML contents of the file
    site_contents = download_site(NASA_APOD_SITE)
    if site_contents == "error":
        logger.error("Could not contact site.")
        exit()  # TODO: `exit` should not appear at all in a clean program

    # Download the image
    filename = get_image(site_contents)
    if filename is not None:
        # Resize the image
        resize_image(filename)

    # Create the desktop switching xml
    filename = create_desktop_background_scroll(filename)
    # If the script was unable todays image and IMAGE_SCROLL is set to False,
    # the script exits
    if filename is None:
        logger.error("Today's image could not be downloaded.")
        exit()  # TODO: `exit` should not appear at all in a clean program

    # Set the wallpaper
    status = set_gnome_wallpaper(filename)
    logger.info("Finished!")
Regards
Atalanttore
__blackjack__
User
Posts: 3524
Registered: Saturday, 2 June 2018, 10:21

Saturday, 13 July 2019, 15:33

@Atalanttore: Because `response.read()` reads the entire response. After that the data is “gone”, as is usual with files: a second `read()` on the same response returns only an empty bytes object.
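The effect can be reproduced without a network connection. In this small sketch an `io.BytesIO` object stands in for the HTTP response (an assumption for demonstration; the real response object behaves the same way for `read()`).

```python
import io

# Stand-in for the object returned by opener.open(req).
response = io.BytesIO(b"<html>APOD</html>")

first = response.read()   # consumes the whole stream
second = response.read()  # nothing is left to read

# Read once, keep the bytes, and reuse them for logging and decoding.
site = first
text = site.decode()
```

Storing the result of the single `read()` call in a variable, as in the last two lines, allows it to be logged and decoded as often as needed.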
Atalanttore
User
Posts: 338
Registered: Friday, 6 August 2010, 17:03

Saturday, 13 July 2019, 17:13

@__blackjack__: Thanks, I did not know that yet. With this information I have now got a bit further with running the program.

The following error messages now appear after a total of 4 images have been downloaded:

Code:

2019-07-13 17:57:29,137 __main__: Done downloading images
Traceback (most recent call last):
  File "/home/ata/PycharmProjects/nasa-apod-desktop/nasa_apod_desktop.py", line 384, in <module>
    filename = create_desktop_background_scroll(filename)
  File "/home/ata/PycharmProjects/nasa-apod-desktop/nasa_apod_desktop.py", line 353, in create_desktop_background_scroll
    xml_tree.write(filename, pretty_print=True)
  File "src/lxml/etree.pyx", line 2039, in lxml.etree._ElementTree.write
  File "src/lxml/serializer.pxi", line 721, in lxml.etree._tofilelike
  File "src/lxml/serializer.pxi", line 780, in lxml.etree._create_output_buffer
  File "src/lxml/serializer.pxi", line 770, in lxml.etree._create_output_buffer
PermissionError: [Errno 13] Permission denied
  1. Why does it fail because of a missing permission?


    At several places in the code the value of a constant is replaced with the return value of a function.
    E.g.:

    Code:

    DEFAULT_RESOLUTION_X, DEFAULT_RESOLUTION_Y = find_display_resolution()
  2. What should one make of that?
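A possible explanation for question 1, assuming `create_desktop_background_scroll()` still builds the XML file name with `os.path.join(TEMPORARY_DOWNLOAD_PATH, '/nasa_apod_desktop_backgrounds.xml')` as in the code posted earlier: a path component that starts with a slash is absolute, so `os.path.join()` discards everything before it and the file would be written to the root directory. The sketch uses `posixpath`, which is what `os.path` is on Linux; the download directory is a made-up example.

```python
import posixpath

# Hypothetical download directory, for illustration only.
download_dir = "/home/user/Downloads/nasa-apod-backgrounds"

# The second component is absolute, so the first one is thrown away.
wrong = posixpath.join(download_dir, "/nasa_apod_desktop_backgrounds.xml")

# Without the leading slash the name is appended to the directory.
right = posixpath.join(download_dir, "nasa_apod_desktop_backgrounds.xml")
```

Here `wrong` points at the file system root, where a normal user usually has no write permission, while `right` stays inside the download directory.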

Current code:

Code:

from gi.repository import GLib
from bs4 import BeautifulSoup

import logging
import subprocess
import urllib.request, urllib.parse, urllib.error
import re
import os
import random
import glob
from PIL import Image
from sys import stdout
from sys import exit
from lxml import etree
from datetime import datetime, timedelta

NASA_APOD_SITE = 'http://apod.nasa.gov/apod/'
TEMPORARY_DOWNLOAD_PATH = '/tmp/backgrounds/'
CUSTOM_FOLDER = 'nasa-apod-backgrounds'

RESOLUTION_TYPE = 'stretch'
DEFAULT_RESOLUTION_X = 1024
DEFAULT_RESOLUTION_Y = 768

IMAGE_SCROLL = True
IMAGE_DURATION = 1200
COUNT_IMAGES_FROM_PREVIOUS_DAYS = 3

LOG_LEVEL = logging.DEBUG
LOG_FORMAT = '%(asctime)s %(name)s: %(message)s'

logger = logging.getLogger(__name__)
logger.setLevel(LOG_LEVEL)

formatter = logging.Formatter(LOG_FORMAT)

stream_handler = logging.StreamHandler()
stream_handler.setFormatter(formatter)

logger.addHandler(stream_handler)


# Use XRandR to grab the desktop resolution. If the scaling method is set to 'largest',
# we will attempt to grab it from the largest connected device. If the scaling method
# is set to 'stretch' we will grab it from the current value. Default will simply use
# what was set for the default resolutions.
def find_display_resolution():  # TODO: Necessary at all?

    if RESOLUTION_TYPE == 'default':
        logger.info(f"Using default resolution of {DEFAULT_RESOLUTION_X}x{DEFAULT_RESOLUTION_Y}")
        return DEFAULT_RESOLUTION_X, DEFAULT_RESOLUTION_Y

    resolution_x = 0
    resolution_y = 0

    logger.info("Attempting to determine the current resolution.")
    if RESOLUTION_TYPE == 'largest':
        regex_search = 'connected'
    else:
        regex_search = 'current'

    p1 = subprocess.Popen(["xrandr"], stdout=subprocess.PIPE)
    p2 = subprocess.Popen(["grep", regex_search], stdin=p1.stdout, stdout=subprocess.PIPE)  # TODO: Use Python's re module

    p3 = re.findall(regex_search, str(p1.communicate()[0]))
    p1.stdout.close()
    output = str(p2.communicate()[0])

    if RESOLUTION_TYPE == 'largest':
        # We are going to go through the connected devices and get the X/Y from the largest
        matches = re.finditer(" connected ([0-9]+)x([0-9]+)+", output)  # TODO: returns an iterator, which is always "truthy".
        if matches:
            largest = 0
            for match in matches:
                if int(match.group(1)) * int(match.group(2)) > largest:
                    resolution_x = match.group(1)
                    resolution_y = match.group(2)
        else:
            logger.warning("Could not determine largest screen resolution.")

    else:
        reg = re.search(".* current (.*?) x (.*?),.*", output)
        if reg:
            resolution_x = reg.group(1)
            resolution_y = reg.group(2)
        else:
            logger.warning("Could not determine current screen resolution.")

    # If we couldn't find anything automatically use what was set for the defaults
    if resolution_x == 0 or resolution_y == 0:
        resolution_x = DEFAULT_RESOLUTION_X
        resolution_y = DEFAULT_RESOLUTION_Y
        logger.warning("Could not determine resolution automatically. Using defaults.")

    logger.info(f"Using detected resolution of {resolution_x}x{resolution_y}")

    return int(resolution_x), int(resolution_y)


# Uses GLib to find the localized "Downloads" folder
# See: http://askubuntu.com/questions/137896/how-to-get-the-user-downloads-folder-location-with-python
def get_user_download_directory():
    downloads_dir = GLib.get_user_special_dir(GLib.USER_DIRECTORY_DOWNLOAD)

    if downloads_dir:
        # Add any custom folder
        new_path = os.path.join(downloads_dir, CUSTOM_FOLDER)
        logger.info(f"Using automatically detected path: {new_path}")
    else:
        new_path = TEMPORARY_DOWNLOAD_PATH
        logger.warning("Could not determine download folder with GLib. Using default.")
    return new_path


# Download HTML of the site
def download_site(url):
    logger.info("Downloading contents of the site to find the image name")
    opener = urllib.request.build_opener()
    req = urllib.request.Request(url)
    try:
        response = opener.open(req)
        site = response.read()
        logger.info(f"Response: {site}")
        reply = site.decode()
    except urllib.error.HTTPError as error:
        logger.error(f"Error downloading {url} - {error.code}")
        reply = "error"
    return reply


# Finds the image URL and saves it
def get_image(text):
    logger.info("Grabbing the image URL")
    file_url, filename, file_size = get_image_info(text)
    # If file_url is None, the today's picture might be a video
    if file_url is None:
        return None  # TODO: Use an exception

    logger.info(f"Found name of image: {filename}")

    save_to = os.path.join(TEMPORARY_DOWNLOAD_PATH, os.path.splitext(filename)[0] + '.png')

    if not os.path.isfile(save_to):
        # If the response body is less than 500 bytes, something went wrong
        if file_size < 500:
            logger.warning("Response less than 500 bytes, probably an error\nAttempting to just grab image source")
            file_url, filename, file_size = get_image_info(text)
            # If file_url is None, the today's picture might be a video
            if file_url is None:
                return None  # TODO: Use an exception
            logger.info(f"Found name of image: {filename}")
            if file_size < 500:
                # Give up
                logger.error("Could not find image to download")
                exit()  # TODO: `exit` should not appear at all in a clean program

            logger.info("Retrieving image")
            urllib.request.urlretrieve(file_url, save_to, print_download_status)

            # Adding additional padding to ensure entire line 
            logger.info(f"\rDone downloading {human_readable_size(file_size)}       ")
        else:
            urllib.request.urlretrieve(file_url, save_to)
    else:
        logger.info("File exists, moving on")

    return save_to


def get_image_info(source):
    # Grabs information about the image
    file_url = None
    file_name = None
    file_size = None

    soup = BeautifulSoup(str(source), 'lxml')
    tags = soup.find_all('a')

    if tags:
        for tag in tags:
            if tag.find("img"):  # TODO: Emit a warning when no a tag contains an img tag
                file_url = urllib.parse.urljoin(NASA_APOD_SITE, tag.get('href'))
                # Create our handle for our remote file
                logger.info("Opening remote URL")

                remote_file = urllib.request.urlopen(file_url)

                file_name = os.path.basename(file_url)  # TODO: Not guaranteed to work in the first place, because paths are something different from URLs

                file_size = int(remote_file.headers.get("content-length"))

    else:
        logger.warning("Could not find an image. May be a video today.")

    return file_url, file_name, file_size


# Resizes the image to the provided dimensions
def resize_image(filename):
    logger.info("Opening local image")

    image = Image.open(filename)
    current_x, current_y = image.size
    if (current_x, current_y) == (DEFAULT_RESOLUTION_X, DEFAULT_RESOLUTION_Y):
        logger.info("Images are currently equal in size. No need to scale.")
    else:
        logger.info(f"Resizing the image from {image.size[0]} x {image.size[1]} to {DEFAULT_RESOLUTION_X} x {DEFAULT_RESOLUTION_Y}")
        image = image.resize((DEFAULT_RESOLUTION_X, DEFAULT_RESOLUTION_Y), Image.ANTIALIAS)

        logger.info(f"Saving the image as {filename}")

        with open(filename, 'wb'):
            image.save(filename, 'PNG')


# Sets the new image as the wallpaper
def set_gnome_wallpaper(file_path):
    logger.info("Setting the wallpaper")
    command = "gsettings set org.gnome.desktop.background picture-uri file://" + file_path
    status, output = subprocess.getstatusoutput(command)  # TODO: use something like subprocess.run instead of subprocess.getstatusoutput
    return status


def print_download_status(block_count, block_size, total_size):
    written_size = human_readable_size(block_count * block_size)
    total_size = human_readable_size(total_size)

    # Adding space padding at the end to ensure we overwrite the whole line
    stdout.write(f"\r{written_size} bytes of {total_size}         ")
    stdout.flush()


def human_readable_size(number_bytes):  # TODO: returns None for sizes of 1073741824 bytes (1 GiB) or more.
    for x in ['bytes', 'KB', 'MB']:
        if number_bytes < 1024.0:
            return "%3.2f%s" % (number_bytes, x)
        number_bytes /= 1024.0


# Creates the necessary XML so background images will scroll through
def create_desktop_background_scroll(filename):
    if not IMAGE_SCROLL:
        return filename

    logger.info("Creating XML file for desktop background switching.")

    filename = os.path.join(TEMPORARY_DOWNLOAD_PATH, '/nasa_apod_desktop_backgrounds.xml')

    # Create our base, background element
    background = etree.Element("background")

    # Grab our PNGs we have downloaded
    images = glob.glob(TEMPORARY_DOWNLOAD_PATH + "/*.png")
    num_images = len(images)

    if num_images < COUNT_IMAGES_FROM_PREVIOUS_DAYS:
        # Start with yesterday and continue going back until we have enough
        logger.info("Downloading images of previous days as well")

        days_back = 0
        images_left = COUNT_IMAGES_FROM_PREVIOUS_DAYS

        while images_left > 0:
            days_back += 1
            logger.info(f"Downloading image ({images_left} left):")
            day_to_try = datetime.now() - timedelta(days=days_back)

            # Filenames look like /apYYMMDD.html
            archive_filename = os.path.join(NASA_APOD_SITE, "ap" + day_to_try.strftime("%y%m%d") + ".html")
            archive_site_contents = download_site(archive_filename)

            # Make sure we didn't encounter an error for some reason
            if archive_site_contents == "error":
                logger.error("Archive site contains an error")
                continue

            archive_filename = get_image(archive_site_contents)
            # If the content was a video or some other error occurred, skip the
            # rest.
            if archive_filename is None:
                continue

            resize_image(archive_filename)

            # Add this to our list of images
            images.append(archive_filename)
            images_left -= 1
        logger.info("Done downloading images")

    # Get our images in a random order so we get a new order every time we get a new file
    random.shuffle(images)
    # Recalculate the number of pictures
    num_images = len(images)

    for i, image in enumerate(images):
        # Create a static entry for keeping this image here for IMAGE_DURATION
        static = etree.SubElement(background, "static")

        # Length of time the background stays
        duration = etree.SubElement(static, "duration")
        duration.text = str(IMAGE_DURATION)

        # Assign the name of the file for our static entry
        static_file = etree.SubElement(static, "file")
        static_file.text = images[i]

        # Create a transition for the animation with a from and to
        transition = etree.SubElement(background, "transition")

        # Length of time for the switch animation
        transition_duration = etree.SubElement(transition, "duration")
        transition_duration.text = "5"

        # We are always transitioning from the current file
        transition_from = etree.SubElement(transition, "from")
        transition_from.text = images[i]

        # Create our transition "to" element
        transition_to = etree.SubElement(transition, "to")

        # If we are at the last image, wrap around and transition to the first one
        if i + 1 == num_images:
            transition_to.text = images[0]
        else:
            transition_to.text = images[i + 1]

    xml_tree = etree.ElementTree(background)
    xml_tree.write(filename, pretty_print=True)

    return filename


if __name__ == '__main__':
    logger.info("Starting")

    # Find desktop resolution
    DEFAULT_RESOLUTION_X, DEFAULT_RESOLUTION_Y = find_display_resolution()

    # Set a localized download folder
    TEMPORARY_DOWNLOAD_PATH = get_user_download_directory()

    # Create the download path if it doesn't exist
    if not os.path.exists(os.path.expanduser(TEMPORARY_DOWNLOAD_PATH)):
        os.makedirs(os.path.expanduser(TEMPORARY_DOWNLOAD_PATH))

    # Grab the HTML contents of the file
    site_contents = download_site(NASA_APOD_SITE)
    if site_contents == "error":
        logger.error("Could not contact site.")
        exit()  # TODO: `exit` should not appear in a clean program at all

    # Download the image
    filename = get_image(site_contents)
    if filename is not None:
        # Resize the image
        resize_image(filename)

    # Create the desktop switching xml
    filename = create_desktop_background_scroll(filename)
    # If the script was unable to download today's image and IMAGE_SCROLL is
    # set to False, the script exits
    if filename is None:
        logger.error("Today's image could not be downloaded.")
        exit()  # TODO: `exit` should not appear in a clean program at all

    # Set the wallpaper
    status = set_gnome_wallpaper(filename)
    logger.info("Finished!")
Regards
Atalanttore
__blackjack__
User
Posts: 3524
Joined: Saturday, 2 June 2018, 10:21

Saturday, 13 July 2019, 17:54

@Atalanttore: Try printing out what the script is attempting to save and where it is trying to save it.
Atalanttore
User
Posts: 338
Joined: Friday, 6 August 2010, 17:03

Saturday, 13 July 2019, 18:27

@__blackjack__: Thanks for the tip. Saving to `/nasa_apod_desktop_backgrounds.xml` with user permissions does not work, of course.

I have now removed the slash in front of the file name. The line, with the identifier renamed to `path_with_filename`, now looks like this.

Code: Select all

path_with_filename = os.path.join(TEMPORARY_DOWNLOAD_PATH, 'nasa_apod_desktop_backgrounds.xml')
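
A minimal sketch of why the leading slash mattered: when a later component passed to `os.path.join()` is absolute, everything before it is discarded. The directory below is a made-up stand-in for `TEMPORARY_DOWNLOAD_PATH`, and `posixpath` (which is what `os.path` is on Linux) is used so the result is the same on every platform.

```python
import posixpath  # the module that os.path refers to on Linux

# Hypothetical stand-in for TEMPORARY_DOWNLOAD_PATH
download_path = "/home/user/Pictures/backgrounds"

# A component with a leading slash is absolute, so the directory is dropped
with_slash = posixpath.join(download_path, "/nasa_apod_desktop_backgrounds.xml")

# Without the leading slash the file name is appended to the directory
without_slash = posixpath.join(download_path, "nasa_apod_desktop_backgrounds.xml")

print(with_slash)
print(without_slash)
```

The first result points at the file system root, which a normal user usually cannot write to.
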
Now the desktop background has changed for the first time, and a downloaded Astronomy Picture of the Day is displayed. The timed switching of images does not work yet, however. The resolution of the image is not ideal either, and the aspect ratio was not preserved.

Is it even necessary to adapt the size of the downloaded images to the screen resolution with the `resize_image()` function?
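
On the lost aspect ratio: one possible approach, sketched here as an assumption and not taken from the script, is to compute a target size that fits inside the screen while keeping the image's proportions, and to resize to that size. The helper name `fit_within` is hypothetical; 3840x1080 is the resolution from the log output earlier in the thread, and the image size is made up.

```python
def fit_within(width, height, max_width, max_height):
    """Return the largest size with the original aspect ratio that fits the box."""
    # Use the smaller of the two scale factors so neither dimension overflows
    scale = min(max_width / width, max_height / height)
    return max(1, round(width * scale)), max(1, round(height * scale))


# Made-up 4:3 image on the 3840x1080 desktop from the log output
print(fit_within(4000, 3000, 3840, 1080))
```

The resulting tuple could be passed to `image.resize()` in place of the fixed screen resolution.
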

Regards
Atalanttore
Atalanttore
User
Posts: 338
Joined: Friday, 6 August 2010, 17:03

Tuesday, 16 July 2019, 19:11

__blackjack__ wrote:
Sunday, 7 July 2019, 21:55
Then something from `os.path` is once again used with a URL. That is not guaranteed to work in the first place, because paths are something different from URLs, and even on systems where paths and URLs look similar it falls flat on its face when the URL also has a “query” and/or “fragment” part.
  1. What is the difference between paths and URLs (on a Linux system)?
  2. `os.path.basename(file_url)` works here, but is apparently not guaranteed to work in every case.
    What is the best (most Pythonic) way to extract the file name from a URL?
Regards
Atalanttore
__deets__
User
Posts: 5782
Joined: Wednesday, 14 October 2015, 14:29

Tuesday, 16 July 2019, 19:16

With https://docs.python.org/3/library/urlli ... llib.parse - and URLs can contain hosts, ports, user names, passwords, protocols, and parameters. Have you ever seen anything like that with files?
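
A minimal sketch of that suggestion, under the assumption that the file name is simply the last segment of the URL's path component: `urllib.parse.urlsplit()` separates the URL into its parts, so query and fragment never reach the basename step. The helper name `filename_from_url` and both URLs are made up for illustration.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def filename_from_url(url):
    """Return the last path segment of a URL, ignoring query and fragment."""
    path = urlsplit(url).path  # only the path component of the URL
    return unquote(posixpath.basename(path))  # decode %XX escapes


# Made-up URL with a query and a fragment
url = "https://example.com/apod/image/1907/some%20picture.jpg?size=large#top"

naive = posixpath.basename(url)  # treats the whole URL like a path
clean = filename_from_url(url)
print(naive)
print(clean)

# URLs carry parts that file paths do not have
parts = urlsplit("https://user:secret@example.com:8443/dir/file.txt?x=1#frag")
print(parts.scheme, parts.username, parts.password, parts.hostname, parts.port)
print(parts.path, parts.query, parts.fragment)
```

The naive variant keeps `?size=large#top` as part of the "file name", which is the failure described in the quoted post.
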