Commit Graph

66 Commits

Author SHA1 Message Date
Evan Lucas
a1e54d6fb7 url: fix parsing of ssh urls
Fix regression introduced in 6120472036
that broke parsing of some ssh: urls.

An example url is ssh://git@github.com:npm/npm.git

PR-URL: https://github.com/iojs/io.js/pull/299
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
2015-01-12 10:56:41 +01:00
Ben Noordhuis
94e147500c Merge remote-tracking branch 'joyent/v0.12' into v1.x
I was originally going to do this after the v0.11.15 release, but as
that release is three weeks overdue now, I decided not to wait any
longer; we don't want the delta to get too big.

Conflicts:
	lib/net.js
	test/simple/simple.status

PR-URL: https://github.com/iojs/io.js/pull/236
Reviewed-By: Bert Belder <bertbelder@gmail.com>
Reviewed-By: Fedor Indutny <fedor@indutny.com>
2015-01-05 17:26:47 +01:00
CGavrila
6a03fce16e url: improve parsing speed
The url.parse() function now checks whether an escapable character is in
the URL before trying to escape it.

PR-URL: https://github.com/joyent/node/pull/8638
[trev.norris@gmail.com: Switch to use continue instead of if]
Signed-off-by: Trevor Norris <trev.norris@gmail.com>
2014-12-30 14:00:19 -08:00
Ben Noordhuis
aff56cd2b4 lib: micro-optimize url.resolve()
Replace the call to Array#splice() with a faster open-coded version
that creates less garbage.

Add a new benchmark to prove it.  With the change applied, it scores
about 5% higher and that is nothing to sneeze at.

PR-URL: https://github.com/iojs/io.js/pull/184
Reviewed-By: Chris Dickinson <christopher.s.dickinson@gmail.com>
2014-12-20 21:33:52 +01:00
Jonathan Johnson
c90eac7e0e url: change hostname regex to negate invalid chars
Regarding joyent/node#8520

This changes hostname validation from a whitelist regex approach
to a blacklist regex approach as described in https://url.spec.whatwg.org/#host-parsing.

url.parse misinterpreted `https://good.com+.evil.org/`
as `https://good.com/+.evil.org/`.  If we use url.parse to check the
validity of the hostname, the test passes, but in the browser the
user is redirected to the evil.org website.
2014-12-09 17:57:10 +01:00
Jonathan Johnson
6120472036 url: change hostname regex to negate invalid chars
Regarding joyent/node#8520

This changes hostname validation from a whitelist regex approach
to a blacklist regex approach as described in https://url.spec.whatwg.org/#host-parsing.

url.parse misinterpreted `https://good.com+.evil.org/`
as `https://good.com/+.evil.org/`.  If we use url.parse to check the
validity of the hostname, the test passes, but in the browser the
user is redirected to the evil.org website.
2014-12-02 17:24:18 -08:00
Yazhong Liu
ba687f6eb4 url: support path for url.format
this adds support for a "path" field that overrides
"query", "search", and "pathname" if given.

Fixes: https://github.com/joyent/node/issues/8722
PR-URL: https://github.com/joyent/node/pull/8755
Reviewed-by: Chris Dickinson <christopher.s.dickinson@gmail.com>
2014-12-02 11:32:08 -08:00
Yazhong Liu
d312b6d15c url: support path for url.format
this adds support for a "path" field that overrides
"query", "search", and "pathname" if given.

Fixes: https://github.com/joyent/node/issues/8722
PR-URL: https://github.com/joyent/node/pull/8755
Reviewed-by: Chris Dickinson <christopher.s.dickinson@gmail.com>
2014-12-02 10:55:22 -08:00
Ben Noordhuis
21130c7d6f lib: turn on strict mode
Turn on strict mode for the files in the lib/ directory.  It helps
catch bugs and can have a positive effect on performance.

PR-URL: https://github.com/node-forward/node/pull/64
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Fedor Indutny <fedor@indutny.com>
2014-11-22 17:23:30 +01:00
Trevor Norris
7b4a540422 src: fix jslint warning
Signed-off-by: Trevor Norris <trev.norris@gmail.com>
2014-10-08 01:13:43 -07:00
Evan Rutledge Borden
640ad632e3 url: fixed encoding for slash switching emulation.
Fixes: https://github.com/joyent/node/issues/8458
Reviewed-by: Trevor Norris <trev.norris@gmail.com>
Reviewed-by: Chris Dickinson <christopher.s.dickinson@gmail.com>
2014-10-06 19:25:25 -05:00
Gabriel Wicke
b705b73e46 url: make query() consistent
Match the behavior of the slow path by setting url.query to an empty
object when the url contains no query, but query parsing is requested.

Also add a test for this case, and update the documents to clearly
reflect this behavior.

Fixes: https://github.com/joyent/node/issues/8332
Reviewed-by: Trevor Norris <trev.norris@gmail.com>
2014-10-01 12:23:01 -07:00
Timothy J Fontaine
7ca5af87a0 Merge remote-tracking branch 'upstream/v0.10' into v0.12
Conflicts:
	ChangeLog
	deps/v8/src/hydrogen.cc
	lib/http.js
	lib/querystring.js
	src/node_crypto.cc
	src/node_version.h
	test/simple/test-querystring.js
2014-09-16 17:48:09 -07:00
Majid Arif Siddiqui
176f0bd3df lib: improved forEach object performance
Reviewed-by: Trevor Norris <trev.norris@gmail.com>
2014-09-05 09:19:32 -07:00
Mathias Bynens
b869797a2d url: Add support for RFC 3490 separators
There is no need to split the host by hand in `url.js` – Punycode.js
takes care of it anyway. This not only simplifies the code, but also
adds support for RFC 3490 separators (i.e. not just U+002E, but U+3002,
U+FF0E, and U+FF61 as well).

Closes #6055.

Reviewed-By: Fedor Indutny <fedor@indutny.com>
2014-08-27 14:36:04 +04:00
Ezequiel Rabinovich
678ead2608 querystring: remove prepended ? from query field
Fixes an issue that caused the first querystring to be parsed prepending
a "?" in the first variable name on relative urls with no #fragment

Reviewed-by: Trevor Norris <trev.norris@gmail.com>
2014-08-12 20:38:30 -07:00
Gabriel Wicke
4b59db008c Add fast path for simple URL parsing
This patch adds a fast path for parsing of simple path-only URLs, as commonly
found in HTTP requests received by a server.

Benchmark results [ms], before / after patch:
/foo/bar              0.008956   0.000418 (fast path used)
http://example.com/   0.011426   0.011437 (normal slow path, no change)

In a simple 'ab' benchmark of a single-threaded web server, this patch
increases the request rate from around 6400 to 7400 req/s.

Reviewed-By: Fedor Indutny <fedor@indutny.com>
2014-07-31 22:56:46 +04:00
isaacs
f7ede33f09 url: treat \ the same as /
See https://code.google.com/p/chromium/issues/detail?id=25916

Parse URLs with backslashes the same as web browsers, by replacing all
backslashes with forward slashes, except those that occur after the
first # character.

Manual rebase of 9520ade

Signed-off-by: Trevor Norris <trev.norris@gmail.com>
2014-05-07 14:07:00 -07:00
isaacs
9520adeb37 url: treat \ the same as /
See https://code.google.com/p/chromium/issues/detail?id=25916

Parse URLs with backslashes the same as web browsers, by replacing all
backslashes with forward slashes, except those that occur after the
first # character.
2014-04-15 15:30:43 -07:00
Yuki KAN
006d42786e lib: use triple equals
Signed-off-by: Trevor Norris <trev.norris@gmail.com>
2014-04-02 02:12:18 -07:00
isaacs
22c68fdc1d src: Replace macros with util functions 2013-08-01 15:08:01 -07:00
Ben Noordhuis
0330bdf519 lib: macro-ify type checks
Increases the grep factor. Makes it easier to harmonize type checks
across the code base.
2013-07-24 21:49:35 +02:00
isaacs
e71d9fd834 Merge remote-tracking branch 'ry/v0.10'
Conflicts:
	doc/api/stream.markdown
	lib/tls.js
2013-07-17 18:32:23 -07:00
Shuan Wang
48a4600c56 url: Fix edge-case when protocol is non-lowercase
When using url.parse(), path and pathname usually return '/' when there
is no path available. However when you have a protocol that contains
non-lowercase letters and the input string does not have a trailing
slash, both path and pathname will be undefined.
2013-07-17 15:59:28 -07:00
isaacs
0882a75063 Merge remote-tracking branch 'ry/v0.10'
Conflicts:
	ChangeLog
	deps/uv/AUTHORS
	deps/uv/ChangeLog
	deps/uv/src/unix/linux-core.c
	deps/uv/src/version.c
	deps/uv/src/win/timer.c
	lib/url.js
	src/node_version.h
	test/simple/test-url.js
2013-06-05 13:38:38 -07:00
Ben Noordhuis
414a909d01 url: remove unused global variable 2013-06-04 11:43:42 -07:00
isaacs
5dd91b0147 url: Set href to null by default 2013-06-03 16:02:51 -07:00
isaacs
5dc51d4e21 url: Properly parse certain oddly formed urls
In cases where there are multiple @-chars in a url, Node currently
parses the hostname and auth sections differently than web browsers.

This part of the bug is serious, and should be landed in v0.10, and
also ported to v0.8, and releases made as soon as possible.

The less serious issue is that there are many other sorts of malformed
urls which Node either accepts when it should reject, or interprets
differently than web browsers.  For example, `http://a.com*foo` is
interpreted by Node like `http://a.com/*foo` when web browsers treat
this as `http://a.com%3Bfoo/`.

In general, *only* the `hostEndingChars` should be the characters that
delimit the host portion of the URL.  Most of the current `nonHostChars`
that appear in the hostname should be escaped, but some of them (such as
`;` and `%` when it does not introduce a hex pair) should raise an
error.

We need to have a broader discussion about whether it's best to throw in
these cases, and potentially break extant programs, or return an object
that has every field set to `null` so that any attempt to read the
hostname/auth/etc. will appear to be empty.
2013-06-03 15:56:16 -07:00
isaacs
881ef7cc5f url: ~ is not actually an unwise char 2013-04-12 16:27:49 -07:00
isaacs
17a379ec39 url: Escape all unwise characters
This makes node's http URL handling logic identical to Chrome's

Re #5284
2013-04-12 11:39:28 -07:00
J. Lee Coltrane
54d293da56 url: make url.format escape delimiters in path and query
`url.format` should escape ? and # chars in pathname, and # chars in
search, because they change the semantics of the operation otherwise.
Don't escape % chars, or anything else. (see: #4082)
2012-10-30 09:16:13 -07:00
Ben Noordhuis
9b61f570d8 Merge remote-tracking branch 'origin/v0.8'
Conflicts:
	configure
	deps/v8/build/common.gypi
2012-10-25 16:08:58 +02:00
Ben Noordhuis
de0303d3ad url: parse hostnames that start with - or _
Allow hostnames like '-lovemonsterz.tumblr.com' and '_jabber._tcp.google.com'.

Fixes #4177.
2012-10-25 01:06:00 +02:00
isaacs
7144be70db url: Go much faster by using Url class
V8 loves it when JavaScript pretends to be a Classic inheritance
type of language.

Before:

$ ./node benchmark/url.js
benchmarking parse() ... 1.868 sec
benchmarking format() ... 1.906 sec
benchmarking resolve("../foo/bar?baz=boom") ... 7.800 sec
benchmarking resolve("foo/bar") ... 7.099 sec
benchmarking resolve("http://nodejs.org") ... 8.403 sec
benchmarking resolve("./foo/bar?baz") ... 7.974 sec

After:

$ ./node benchmark/url.js
benchmarking parse() ... 1.769 sec
benchmarking format() ... 1.793 sec
benchmarking resolve("../foo/bar?baz=boom") ... 4.254 sec
benchmarking resolve("foo/bar") ... 3.932 sec
benchmarking resolve("http://nodejs.org") ... 4.382 sec
benchmarking resolve("./foo/bar?baz") ... 4.293 sec
2012-09-17 10:44:23 -07:00
isaacs
9fc7283a40 Fix #3270 Escape url.parse delims
Rather than omitting them.
2012-05-16 15:41:28 -07:00
Łukasz Walukiewicz
677c2c112c Ignore an empty port component when parsing URLs. 2012-03-12 12:46:56 -07:00
Ben Noordhuis
86f4846c21 url: decode url entities in auth section
Fixes #2736.
2012-02-20 17:11:21 +01:00
Łukasz Walukiewicz
a94ffdaec5 url: Support for IPv6 addresses in URLs.
Fixes #1138, #2610.
2012-01-31 16:58:41 +09:00
Jordan Sissel
358f0ce96b url: add '.' '+' and '-' in url protocol
- Based on BNF from RFC 1738 section 5.
- Added tests to cover svn+ssh and some other examples
2011-11-04 13:36:06 +01:00
seebees
216570b5e1 Lint 2011-10-22 14:14:40 +09:00
seebees
1ead20f274 remove auth from host
Fixes #1626
2011-10-22 14:14:40 +09:00
seebees
be4576de7a url.resolveObject(url.parse(x), y) == url.parse(url.resolve(x, y));
added a .path property = .pathname + .search for use with http.request

And tests to verify everything.
With the tests, I changed over to deepEqual, and I would note the comment on the test
['.//g', 'f:/a', 'f://g'], which I think is a fundamental problem

This supersedes pull 1596
2011-10-22 14:14:39 +09:00
Maciej Małecki
d0552949b9 url: add plus sign to protocol pattern 2011-09-06 17:03:52 +02:00
Robert Mustacchi
de0b8d601c jslint cleanup: path.js, readline.js, repl.js, tls.js, tty_win32.js, url.js 2011-07-29 11:58:02 -07:00
Ben Noordhuis
1b0e054737 url: throw descriptive error if url argument to parse() is not a string
Fixes #568.
2011-07-21 00:51:48 +02:00
isaacs
dcecfc5f1b Close #1360 url: Allow _ in hostnames. 2011-07-19 09:55:01 -07:00
isaacs
87900b14da url: Don't swallow punycode errors 2011-07-06 13:17:50 -07:00
Jeremy Selier
2a848fa727 Close #1149 IDNA and Punycode support in url.parse
Using @bnoordhuis's punycode lib.

Close #1174 also
2011-07-06 13:17:50 -07:00
Ryan Petrello
58a1d7ec30 Close #562 Close #1078 Parse file:// urls properly
The file:// protocol *always* has a hostname; it's frequently
abbreviated as an empty string, which represents 'localhost'
implicitly.

According to RFC 1738 (http://tools.ietf.org/html/rfc1738):

A file URL takes the form:

   file://<host>/<path>

where <host> is the fully qualified domain name of the system on
which the <path> is accessible...

As a special case, <host> can be the string "localhost" or the empty
string; this is interpreted as 'the machine from which the URL is
being interpreted'.
2011-05-27 10:47:34 -07:00
isaacs
307f39ce9e Fix a url regression
The change for #954 introduced a regression that would cause
the url parser to fail on special chars found in the auth
segment.  Fix that, and also don't create invalid urls when
format() is called on an object containing an auth member
containing '@' characters or delimiters.
2011-05-10 13:57:25 -07:00