Dillo v3.1.1-111-gd4f56d0d
url.c File Reference

Parse and normalize all URLs inside Dillo.

#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include "url.h"
#include "hsts.h"
#include "misc.h"
#include "msg.h"

Macros

#define URL_STR_FIELD_CMP(s1, s2)    (s1) && (s2) ? strcmp(s1,s2) : !(s1) && !(s2) ? 0 : (s1) ? 1 : -1
 
#define URL_STR_FIELD_I_CMP(s1, s2)    (s1) && (s2) ? dStrAsciiCasecmp(s1,s2) : !(s1) && !(s2) ? 0 : (s1) ? 1 : -1
 

Functions

char * a_Url_str (const DilloUrl *u)
 Return the url as a string.
 
const char * a_Url_hostname (const DilloUrl *u)
 Return the hostname as a string.
 
static DilloUrl * Url_object_new (const char *uri_str)
 Create a DilloUrl object and initialize it.
 
void a_Url_free (DilloUrl *url)
 Free a DilloUrl.
 
static Dstr * Url_resolve_relative (const char *RelStr, const char *BaseStr)
 Resolve the URL as RFC3986 suggests.
 
DilloUrl * a_Url_new (const char *url_str, const char *base_url)
 Transform (and resolve) a URL string into the respective DilloURL.
 
DilloUrl * a_Url_dup (const DilloUrl *ori)
 Duplicate a Url structure.
 
int a_Url_cmp (const DilloUrl *A, const DilloUrl *B)
 Compare two URLs to check if they're the same, or which one is bigger.
 
void a_Url_set_flags (DilloUrl *u, int flags)
 Set DilloUrl flags.
 
void a_Url_set_data (DilloUrl *u, Dstr **data)
 Set DilloUrl data (like POST info, etc.)
 
void a_Url_set_ismap_coords (DilloUrl *u, char *coord_str)
 Set DilloUrl ismap coordinates.
 
static int Url_decode_hex_octet (const char *s)
 Given a hex octet (e.g., e3, 2F, 20), return the corresponding character if the octet is valid, and -1 otherwise.
 
char * a_Url_decode_hex_str (const char *str)
 Parse possible hexadecimal octets in the URI path.
 
char * a_Url_encode_hex_str (const char *str)
 Urlencode 'str'.
 
char * a_Url_string_strip_delimiters (const char *str)
 RFC-3986 suggests this stripping when "importing" URLs from other media.
 
int a_Url_host_type (const char *host)
 What type of host is this?
 
static uint_t Url_host_public_internal_dots (const char *host)
 How many internal dots are in the public portion of this hostname?
 
static const char * Url_host_find_public_suffix (const char *host)
 Given a URL host string, return the portion that is public.
 
bool_t a_Url_same_organization (const DilloUrl *u1, const DilloUrl *u2)
 

Variables

static const char * HEX = "0123456789ABCDEF"
 

Detailed Description

Parse and normalize all URLs inside Dillo.

  • <scheme> <authority> <path> <query> and <fragment> point to 'buffer'.
  • 'url_string' is built upon demand (transparent to the caller).
  • 'hostname' and 'port' are also being handled on demand.

Definition in file url.c.

Macro Definition Documentation

◆ URL_STR_FIELD_CMP

#define URL_STR_FIELD_CMP (   s1,
  s2 
)     (s1) && (s2) ? strcmp(s1,s2) : !(s1) && !(s2) ? 0 : (s1) ? 1 : -1

Definition at line 57 of file url.c.

◆ URL_STR_FIELD_I_CMP

#define URL_STR_FIELD_I_CMP (   s1,
  s2 
)     (s1) && (s2) ? dStrAsciiCasecmp(s1,s2) : !(s1) && !(s2) ? 0 : (s1) ? 1 : -1

Definition at line 59 of file url.c.
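
The two macros share one NULL-handling rule, which a function form may make easier to read. The following is a minimal sketch, assuming only what the macro bodies show; the name str_field_cmp is hypothetical and not part of url.c.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical function form of URL_STR_FIELD_CMP; the name is ours,
 * not part of url.c. */
int str_field_cmp(const char *s1, const char *s2)
{
   if (s1 && s2)
      return strcmp(s1, s2);   /* both fields present: plain comparison */
   if (!s1 && !s2)
      return 0;                /* both fields absent: considered equal */
   return s1 ? 1 : -1;         /* a present field sorts after an absent one */
}
```

Note that the macro bodies are not wrapped in outer parentheses, so an expression such as URL_STR_FIELD_CMP(a, b) == 0 would bind "== 0" to the final -1 branch only. Parenthesizing the macro call at the call site avoids this.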

Function Documentation

◆ a_Url_cmp()

int a_Url_cmp ( const DilloUrl *  A,
const DilloUrl *  B 
)

Compare two URLs to check if they're the same, or which one is bigger.

The fields compared here are <scheme>, <authority>, <path>, <query> and <data>. Other fields are left for the caller to check.

Return value: 0 if equal, > 0 if A > B, < 0 if A < B.

Note: this function defines a sorting order different from strcmp!

Definition at line 506 of file url.c.

References DilloUrl::authority, DilloUrl::data, dReturn_val_if_fail, dStr_cmp(), DilloUrl::path, DilloUrl::query, DilloUrl::scheme, URL_STR_FIELD_CMP, and URL_STR_FIELD_I_CMP.

Referenced by a_Bw_add_url(), a_Bw_get_url_doc(), a_Capi_conn_abort_by_url(), a_History_add_url(), a_History_get_title_by_url(), a_History_set_title_by_url(), a_Nav_cancel_expect_if_eq(), a_Nav_push(), Cache_entry_by_url_cmp(), Cache_entry_cmp(), Dicache_entry_cmp(), Html_tag_open_meta(), DilloHtml::loadImages(), and Nav_open_url().

◆ a_Url_decode_hex_str()

char * a_Url_decode_hex_str ( const char *  str)

Parse possible hexadecimal octets in the URI path.

Returns a newly allocated string.

Definition at line 586 of file url.c.

References dNew, dRealloc(), dStrdup(), and Url_decode_hex_octet().
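
The decoding step can be illustrated with a self-contained sketch. It is not the url.c implementation: it uses malloc() instead of Dillo's dNew/dRealloc, and the names decode_hex_octet and decode_hex_str are hypothetical. It assumes that an invalid or truncated "%XX" sequence is copied through unchanged.

```c
#include <stdlib.h>
#include <string.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_digit_value(char c)
{
   if (c >= '0' && c <= '9') return c - '0';
   if (c >= 'a' && c <= 'f') return c - 'a' + 10;
   if (c >= 'A' && c <= 'F') return c - 'A' + 10;
   return -1;
}

/* Decode the two hex digits at s; return the byte value or -1.
 * s[1] is only read when s[0] is a valid digit, so a string that
 * ends early is handled safely. */
static int decode_hex_octet(const char *s)
{
   int hi, lo;

   if ((hi = hex_digit_value(s[0])) < 0) return -1;
   if ((lo = hex_digit_value(s[1])) < 0) return -1;
   return hi * 16 + lo;
}

/* Return a newly allocated copy of str with valid %XX sequences decoded.
 * The caller frees the result. */
char *decode_hex_str(const char *str)
{
   char *out, *dst;
   int val;

   if (!str) return NULL;
   /* Decoding never grows the string, so strlen(str) + 1 is enough. */
   out = dst = malloc(strlen(str) + 1);
   if (!out) return NULL;

   while (*str) {
      if (*str == '%' && (val = decode_hex_octet(str + 1)) >= 0) {
         *dst++ = (char)val;
         str += 3;
      } else {
         *dst++ = *str++;
      }
   }
   *dst = '\0';
   return out;
}
```

In this sketch a decoded "%00" terminates the result early from the point of view of C string functions; a caller that cares would need to treat that case separately.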

◆ a_Url_dup()

◆ a_Url_encode_hex_str()

char * a_Url_encode_hex_str ( const char *  str)

Urlencode 'str'.

-RL :: According to RFC 1738, only alphanumerics, the special characters "$-_.+!*'(),", and reserved characters ";/?:@=&" used for their reserved purposes may be used unencoded within a URL. We'll escape everything but alphanumerics and "-_.*" (as lynx does). –Jcid

Note: the content type "application/x-www-form-urlencoded" is used: i.e., ' ' -> '+' and '\n' -> CR LF (see HTML 4.01, Sec. 17.13.4)

Definition at line 620 of file url.c.

References d_isascii, dIsalnum, dNew, and HEX.

Referenced by Menu_bugmeter_validate(), and UIcmd_make_search_str().
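
The escaping rules described here can be sketched as follows. This is an illustration under stated assumptions, not the url.c code: it uses malloc() rather than dNew, the name encode_form_str is hypothetical, and it assumes that "CR LF" means the escaped sequence "%0D%0A".

```c
#include <stdlib.h>
#include <string.h>

static const char HEX_DIGITS[] = "0123456789ABCDEF";

/* ASCII-only alphanumeric test, independent of the current locale. */
static int is_ascii_alnum(unsigned char c)
{
   return (c >= '0' && c <= '9') ||
          (c >= 'a' && c <= 'z') ||
          (c >= 'A' && c <= 'Z');
}

/* Return a newly allocated form-urlencoded copy of str.
 * The caller frees the result. */
char *encode_form_str(const char *str)
{
   char *out, *dst;

   if (!str) return NULL;
   /* Worst case: every byte is '\n', which grows to six bytes. */
   out = dst = malloc(6 * strlen(str) + 1);
   if (!out) return NULL;

   for (; *str; str++) {
      unsigned char ch = (unsigned char)*str;

      if (is_ascii_alnum(ch) || strchr("-_.*", ch)) {
         *dst++ = (char)ch;               /* kept verbatim */
      } else if (ch == ' ') {
         *dst++ = '+';                    /* form encoding of a space */
      } else if (ch == '\n') {
         memcpy(dst, "%0D%0A", 6);        /* newline becomes CR LF */
         dst += 6;
      } else {
         *dst++ = '%';
         *dst++ = HEX_DIGITS[ch >> 4];
         *dst++ = HEX_DIGITS[ch & 15];
      }
   }
   *dst = '\0';
   return out;
}
```

Working on unsigned char keeps the shift and mask well defined for bytes above 0x7F, which matters on platforms where plain char is signed.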

◆ a_Url_free()

◆ a_Url_host_type()

int a_Url_host_type ( const char *  host)

◆ a_Url_hostname()

const char * a_Url_hostname ( const DilloUrl *  u)

Return the hostname as a string.

Initializes the 'hostname' and 'port' fields if necessary. Note: a similar approach can be taken for user:password auth.

Definition at line 98 of file url.c.

References DilloUrl::authority, dStrAsciiCasecmp(), dStrndup(), DilloUrl::hostname, DilloUrl::port, DilloUrl::scheme, URL_HTTP_PORT, and URL_HTTPS_PORT.

Referenced by a_Url_new().
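
As a rough illustration of deriving 'hostname' and 'port' from <authority>, the sketch below splits an authority string at its port separator. It is a simplification, not the url.c logic: the name split_authority is hypothetical, userinfo ("user:password@") is not handled, and the default port is supplied by the caller (in url.c that role is presumably played by URL_HTTP_PORT and URL_HTTPS_PORT, chosen by scheme).

```c
#include <stdlib.h>
#include <string.h>

/* Split "host[:port]" into a newly allocated host string and a port.
 * Returns 0 on success, -1 on allocation failure.
 * Hypothetical helper for illustration only. */
int split_authority(const char *authority, int default_port,
                    char **host_out, int *port_out)
{
   const char *colon = strrchr(authority, ':');
   const char *bracket = strrchr(authority, ']');
   size_t host_len;

   /* A ':' inside an IPv6 literal such as "[::1]" is not a port separator. */
   if (colon && bracket && colon < bracket)
      colon = NULL;

   host_len = colon ? (size_t)(colon - authority) : strlen(authority);
   *host_out = malloc(host_len + 1);
   if (!*host_out)
      return -1;
   memcpy(*host_out, authority, host_len);
   (*host_out)[host_len] = '\0';

   /* An empty port ("host:") falls back to the default. */
   *port_out = (colon && colon[1]) ? (int)strtol(colon + 1, NULL, 10)
                                   : default_port;
   return 0;
}
```

For the authority "dillo.sf.net:8080" used in the a_Url_new() example, this yields the host "dillo.sf.net" and the port 8080.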

◆ a_Url_new()

DilloUrl * a_Url_new ( const char *  url_str,
const char *  base_url 
)

Transform (and resolve) a URL string into the respective DilloURL.

If URL = "http://dillo.sf.net:8080/index.html?long#part2" then the resulting DilloURL should be:

DilloURL = {
url_string = "http://dillo.sf.net:8080/index.html?long#part2"
scheme = "http"
authority = "dillo.sf.net:8080"
path = "/index.html"
query = "long"
fragment = "part2"
hostname = "dillo.sf.net"
port = 8080
flags = URL_Get
data = Dstr * ("")
ismap_url_len = 0
}

Return NULL if URL is badly formed.

Definition at line 371 of file url.c.

References _MSG, a_Hsts_require_https(), a_Url_hostname(), DilloUrl::authority, DilloUrl::data, dFree(), dNew, dStr_free(), dStr_new(), dStrAsciiCasecmp(), dStrconcat(), FALSE, HEX, DilloPrefs::http_force_https, DilloPrefs::http_strict_transport_security, DilloUrl::illegal_chars, DilloUrl::illegal_chars_spc, DilloUrl::port, prefs, DilloUrl::scheme, Dstr::str, TRUE, URL_HTTP_PORT, URL_HTTPS_PORT, Url_object_new(), Url_resolve_relative(), and DilloUrl::url_string.

Referenced by a_Cache_init(), a_Html_url_new(), a_Http_init(), a_Prefs_init(), a_UIcmd_book(), a_UIcmd_open_file(), a_UIcmd_open_urlstr(), a_UIcmd_view_page_source(), StyleEngine::apply(), Cache_parse_header(), Cache_redirect(), makeStartUrl(), parseOption(), and CssParser::parseUrl().

◆ a_Url_same_organization()

bool_t a_Url_same_organization ( const DilloUrl *  u1,
const DilloUrl *  u2 
)

◆ a_Url_set_data()

void a_Url_set_data ( DilloUrl *  u,
Dstr **  data 
)

Set DilloUrl data (like POST info, etc.)

Definition at line 536 of file url.c.

References DilloUrl::data, and dStr_free().

◆ a_Url_set_flags()

void a_Url_set_flags ( DilloUrl *  u,
int  flags 
)

◆ a_Url_set_ismap_coords()

void a_Url_set_ismap_coords ( DilloUrl *  u,
char *  coord_str 
)

Set DilloUrl ismap coordinates.

(this is optimized for not hogging the CPU)

Definition at line 549 of file url.c.

References dReturn_if_fail, dStr_append(), dStr_truncate(), DilloUrl::ismap_url_len, Dstr::len, DilloUrl::query, Dstr::str, URL_STR_, and DilloUrl::url_string.

Referenced by Html_set_link_coordinates().

◆ a_Url_str()

char * a_Url_str ( const DilloUrl *  u)

◆ a_Url_string_strip_delimiters()

char * a_Url_string_strip_delimiters ( const char *  str)

RFC-3986 suggests this stripping when "importing" URLs from other media.

Strip: "URL:", enclosing < >, and embedded whitespace. (We also strip illegal chars: 00-1F and 7F-FF)

Definition at line 658 of file url.c.

References dStrdup().

Referenced by a_UIcmd_open_urlstr(), and makeStartUrl().
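
The stripping rules can be sketched in a few lines. This is an illustration, not the url.c code: it uses malloc() rather than dStrdup(), the name strip_url_delimiters is hypothetical, and it assumes that the "URL:" prefix and the opening '<' are only recognized at the very start of the string.

```c
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated copy of str without a leading "URL:",
 * enclosing < >, whitespace, or bytes outside printable ASCII.
 * The caller frees the result. */
char *strip_url_delimiters(const char *str)
{
   char *out, *dst;

   if (!str) return NULL;
   out = dst = malloc(strlen(str) + 1);
   if (!out) return NULL;

   if (strncmp(str, "URL:", 4) == 0)
      str += 4;
   if (*str == '<')
      str++;

   for (; *str; str++) {
      unsigned char ch = (unsigned char)*str;
      /* Drops 00-1F, the space (20), and 7F-FF. */
      if (ch > 0x20 && ch < 0x7F)
         *dst++ = (char)ch;
   }
   /* Remove the closing '>' if it ended up last. */
   if (dst > out && dst[-1] == '>')
      dst--;
   *dst = '\0';
   return out;
}
```

One consequence of this sketch is that a trailing '>' is removed even when no opening '<' was present.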

◆ Url_decode_hex_octet()

static int Url_decode_hex_octet ( const char *  s)
static

Given a hex octet (e.g., e3, 2F, 20), return the corresponding character if the octet is valid, and -1 otherwise.

Definition at line 568 of file url.c.

Referenced by a_Url_decode_hex_str().

◆ Url_host_find_public_suffix()

static const char * Url_host_find_public_suffix ( const char *  host)
static

Given a URL host string, return the portion that is public.

i.e., the domain that is in a registry outside the organization. For 'www.dillo.org', that would be 'dillo.org'.

Definition at line 762 of file url.c.

References _MSG, a_Url_host_type(), URL_HOST_NAME, and Url_host_public_internal_dots().

Referenced by a_Url_same_organization().

◆ Url_host_public_internal_dots()

static uint_t Url_host_public_internal_dots ( const char *  host)
static

How many internal dots are in the public portion of this hostname?

e.g., for "www.dillo.org", it is one because everything under "dillo.org", as a .org domain, is part of one organization.

Of course this is only a simple and imperfect approximation of organizational boundaries.

Definition at line 711 of file url.c.

References _MSG, and dStrnAsciiCasecmp().

Referenced by Url_host_find_public_suffix().
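
The relation between the internal-dot count and the public portion of a hostname can be sketched as follows. The function name public_portion is hypothetical, and the dot count is passed in by the caller; how url.c decides that count for a given top-level domain is not shown here. The sketch also ignores IP addresses and trailing dots.

```c
#include <stddef.h>
#include <string.h>

/* Return a pointer into host at the start of its public portion,
 * i.e. the shortest suffix containing 'internal_dots' dots that is
 * preceded by a dot. If host has no more dots than that, the whole
 * host is returned. Hypothetical helper for illustration only. */
const char *public_portion(const char *host, unsigned internal_dots)
{
   const char *p = host + strlen(host);
   unsigned dots = 0;

   /* Walk backwards; the dot that exceeds the allowed count marks
    * the boundary, and p already points just past it. */
   while (p > host) {
      if (p[-1] == '.' && ++dots > internal_dots)
         return p;
      p--;
   }
   return host;
}
```

With one internal dot, "www.dillo.org" yields "dillo.org", matching the example given for Url_host_find_public_suffix(). A suffix such as "co.uk" would need a count of two to keep the organization label.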

◆ Url_object_new()

static DilloUrl * Url_object_new ( const char *  uri_str)
static

Create a DilloUrl object and initialize it.

(buffer, scheme, authority, path, query and fragment).

Definition at line 137 of file url.c.

References DilloUrl::authority, DilloUrl::buffer, dNew, dNew0, dReturn_val_if_fail, dStrstrip(), DilloUrl::flags, DilloUrl::fragment, MAX, DilloUrl::path, DilloUrl::query, DilloUrl::scheme, and URL_Get.

Referenced by a_Url_dup(), a_Url_new(), and Url_resolve_relative().
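
The idea that <scheme>, <authority>, <path>, <query> and <fragment> all point into one buffer can be illustrated with an in-place splitter. This is a simplified sketch, not the Url_object_new() code: the struct url_parts and the function split_url are hypothetical, no validation is done, and absent components are left as NULL.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the pointer fields of a DilloUrl. */
struct url_parts {
   char *scheme, *authority, *path, *query, *fragment;
};

/* Split buf in place. Every non-NULL field points into buf, which
 * must be writable and must outlive the struct. */
void split_url(char *buf, struct url_parts *u)
{
   char *s = buf, *p;

   u->scheme = u->authority = u->path = u->query = u->fragment = NULL;

   /* scheme: text before a ':' that precedes any '/', '?' or '#'. */
   p = strpbrk(s, ":/?#");
   if (p && *p == ':' && p > s) {
      *p = '\0';
      u->scheme = s;
      s = p + 1;
   }

   /* authority: follows "//" and runs to the next '/', '?' or '#'.
    * It is slid left over the "//" so that a terminating NUL fits
    * without overwriting the first character of the path. */
   if (s[0] == '/' && s[1] == '/') {
      size_t len = strcspn(s + 2, "/?#");
      char *rest = s + 2 + len;

      memmove(s, s + 2, len);
      s[len] = '\0';
      u->authority = s;
      s = rest;
   }

   /* fragment first, because '?' may legally appear inside it. */
   if ((p = strchr(s, '#')) != NULL) {
      *p = '\0';
      u->fragment = p + 1;
   }
   if ((p = strchr(s, '?')) != NULL) {
      *p = '\0';
      u->query = p + 1;
   }
   u->path = s;   /* may be the empty string */
}
```

Applied to a writable copy of "http://dillo.sf.net:8080/index.html?long#part2", the sketch produces the scheme, authority, path, query and fragment values listed in the a_Url_new() example.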

◆ Url_resolve_relative()

static Dstr * Url_resolve_relative ( const char *  RelStr,
const char *  BaseStr 
)
static

Variable Documentation

◆ HEX

const char* HEX = "0123456789ABCDEF"
static

Definition at line 54 of file url.c.

Referenced by a_Url_encode_hex_str(), a_Url_new(), and dStr_printable().