
Saving the source code of web pages with VB

bechri / 16 replies / Nickles

Hello,
I have already searched Google but found nothing. I would like to write a small program that saves the source code of a previously specified web page to my computer as a .txt file, so that it can be processed further afterwards.

Which functions do I need for this?

Regards
BeChri

Regards, Chris
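The two steps the question describes (fetch the page at a given URL, then write its source to a .txt file) can be sketched roughly as follows. This is only an illustration in Python, not VB, and not something anyone in the thread posted; the function name `save_page_source` and the `timeout` default are assumptions made for the example.

```python
import urllib.request


def save_page_source(url: str, out_path: str, timeout: float = 30.0) -> int:
    """Download the raw source of `url` and write it to `out_path`.

    Returns the number of bytes written. The bytes are stored unchanged,
    so the file keeps whatever encoding the server sent.
    """
    # Step 1: request the page and read the complete response body.
    with urllib.request.urlopen(url, timeout=timeout) as response:
        data = response.read()

    # Step 2: write the body to disk in binary mode (no re-encoding).
    with open(out_path, "wb") as out_file:
        out_file.write(data)

    return len(data)
```

A call such as `save_page_source("http://example.com/", "page.txt")` (URL and file name are placeholders) would leave the page source in `page.txt`, ready for further processing. In VB the same two steps apply; only the functions used for them differ.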
d-oli Borlander "Of course, you could also simply use the win32 API, or a..."
BORLANDER: or a library that contains the corresponding functions
BECHRI: Which functions do I need for this?
...


For learning, yes, program it yourself. For production use, probably not, if a program already exists that can do everything that is required. That is not to say there are no exceptions ...

d-oli

D:\temp>wget --help
GNU Wget 1.5.3.1, a non-interactive network retriever.
Usage: wget [OPTION]... [URL]...

Mandatory arguments to long options are mandatory for short options too.

Startup:
-V, --version display the version of Wget and exit.
-h, --help print this help.
-b, --background go to background after startup.
-e, --execute=COMMAND execute a `.wgetrc' command.

Logging and input file:
-o, --output-file=FILE log messages to FILE.
-a, --append-output=FILE append messages to FILE.
-d, --debug print debug output.
-q, --quiet quiet (no output).
-v, --verbose be verbose (this is the default).
-nv, --non-verbose turn off verboseness, without being quiet.
-i, --input-file=FILE read URL-s from file.
-F, --force-html treat input file as HTML.

Download:
-t, --tries=NUMBER set number of retries to NUMBER (0 unlimits).
-O --output-document=FILE write documents to FILE.
-nc, --no-clobber don't clobber existing files.
-c, --continue restart getting an existing file.
--dot-style=STYLE set retrieval display style.
-N, --timestamping don't retrieve files if older than local.
-S, --server-response print server response.
--spider don't download anything.
-T, --timeout=SECONDS set the read timeout to SECONDS.
-w, --wait=SECONDS wait SECONDS between retrievals.
-Y, --proxy=on/off turn proxy on or off.
-Q, --quota=NUMBER set retrieval quota to NUMBER.

Directories:
-nd --no-directories don't create directories.
-x, --force-directories force creation of directories.
-nH, --no-host-directories don't create host directories.
-P, --directory-prefix=PREFIX save files to PREFIX/...
--cut-dirs=NUMBER ignore NUMBER remote directory components.

HTTP options:
--http-user=USER set http user to USER.
--http-passwd=PASS set http password to PASS.
-C, --cache=on/off (dis)allow server-cached data (normally allowed).
--ignore-length ignore `Content-Length' header field.
--header=STRING insert STRING among the headers.
--proxy-user=USER set USER as proxy username.
--proxy-passwd=PASS set PASS as proxy password.
-s, --save-headers save the HTTP headers to file.
-U, --user-agent=AGENT identify as AGENT instead of Wget/VERSION.

FTP options:
--retr-symlinks retrieve FTP symbolic links.
-g, --glob=on/off turn file name globbing on or off.
--passive-ftp use the "passive" transfer mode.

Recursive retrieval:
-r, --recursive recursive web-suck -- use with care!.
-l, --level=NUMBER maximum recursion depth (0 to unlimit).
--delete-after delete downloaded files.
-k, --convert-links convert non-relative links to relative.
-m, --mirror turn on options suitable for mirroring.
-nr, --dont-remove-listing don't remove `.listing' files.

Recursive accept/reject:
-A, --accept=LIST list of accepted extensions.
-R, --reject=LIST list of rejected extensions.
-D, --domains=LIST list of accepted domains.
--exclude-domains=LIST comma-separated list of rejected domains.
-L, --relative follow relative links only.
--follow-ftp follow FTP links from HTML documents.
-H, --span-hosts go to foreign hosts when recursive.
-I, --include-directories=LIST list of allowed directories.
-X, --exclude-directories=LIST list of excluded directories.
-nh, --no-host-lookup don't DNS-lookup hosts.
-np, --no-parent don't ascend to the parent directory.

Mail bug reports and suggestions to <bug-wget@gnu.org>.
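If an existing tool is to do the download, a program only has to call it. The following is a minimal sketch in Python, assuming `wget` is installed and on the PATH; the function names `build_wget_command` and `save_with_wget` are hypothetical. It uses only the `-q` and `-O` options listed in the help output above.

```python
import subprocess
from typing import List


def build_wget_command(url: str, out_path: str) -> List[str]:
    """Build a wget call that saves `url` to `out_path` without log output."""
    # -q: quiet (no output); -O FILE: write documents to FILE.
    return ["wget", "-q", "-O", out_path, url]


def save_with_wget(url: str, out_path: str) -> None:
    """Run wget; raises CalledProcessError if wget exits with an error code."""
    # Passing a list (no shell) avoids quoting problems in URLs.
    subprocess.run(build_wget_command(url, out_path), check=True)
```

On the command line this corresponds to `wget -q -O page.txt URL`, where `page.txt` and the URL are placeholders.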

Constructive criticism is characterized by being polite, useful, and factual.