Commit Graph

555 Commits

Author SHA1 Message Date
Markus Heiser fe6bac5a08 [fix] pip install -e: legacy editable install (setup.py develop) is deprecated
From [1]: There is now a standardized mechanism [2] for an installer like pip to
request an editable install of a project.  pip is transitioning to using this
standard only instead of invoking the deprecated `setup.py develop` command.

For backward compatibility, we can use switches:

--use-pep517
  https://pip.pypa.io/en/stable/cli/pip_install/#cmdoption-use-pep517

--no-build-isolation
  https://pip.pypa.io/en/stable/cli/pip_install/#cmdoption-no-build-isolation

- [1] https://github.com/pypa/pip/issues/11457
- [2] https://peps.python.org/pep-0660/

Closes: https://github.com/searxng/searxng/issues/3701
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-08-21 12:09:14 +02:00
Bnyro 9bbcd37138 [docs] engine_overview.rst: add length and views parameter to videos template 2024-07-27 11:49:58 +02:00
Bnyro 84abab0808 [feat] engine: implementation of geizhals.de 2024-07-27 11:46:25 +02:00
Markus Heiser 2039060b64 [mod] revision of the settings_loader
The intention of this PR is to modernize the settings_loader implementations.
The concept is old (remember, this is partly from 2014), back then we only had
one config file, meanwhile we have had a folder with config files for a very
long time.  Callers can now load a YAML configuration from this folder as
follows ::

    settings_loader.get_yaml_cfg('my-config.yml')

- BTW this is a fix of #3557.

- Further the `existing_filename_or_none` construct dates back to times when
  there was not yet a `pathlib.Path` in all Python versions we supported in the
  past.

- Typehints have been added wherever appropriate

At the same time, this patch should also be downward compatible and not
introduce a new environment variable. The localization of the folder with the
configurations is further based on:

    SEARXNG_SETTINGS_PATH (wich defaults to /etc/searxng/settings.yml)

Which means, the default config folder is `/etc/searxng/`.

ATTENTION: intended functional changes!

 If SEARXNG_SETTINGS_PATH was set and pointed to a not existing file, the
 previous implementation silently loaded the default configuration.  This
 behavior has been changed: if the file or folder does not exist, an
 EnvironmentError exception will be thrown in future.

Closes: https://github.com/searxng/searxng/issues/3557
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-07-14 18:10:06 +02:00
Bnyro e4da22ee51 [feat] engine: implementation of alpine linux packages
Co-authored-by: Markus Heiser <markus.heiser@darmarit.de>
2024-07-14 17:57:58 +02:00
holysoles 7be468d213 [feat] docker: add env vars for common public instance settings 2024-06-14 14:58:02 +02:00
Ember cb945276b6 Change 'his/her' to 'them' 2024-06-13 10:23:01 +02:00
Markus Heiser 845a0b678d [doc] add 'hostnames' plugin to the online documentation
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-06-07 14:42:52 +02:00
Bnyro 3bec04079c [feat] hostname replace plugin: possibility to prioritize certain websites
Co-authored-by: Markus Heiser <markus.heiser@darmarit.de>
2024-06-07 14:42:52 +02:00
allendema_searxng_pi ee146dbc07 [enh] Add engine for discourse forums 2024-06-07 10:16:09 +02:00
Daniel Kukula cc8b537e34 [mod] package.html template: additional links (a python dict)
- Closes: https://github.com/searxng/searxng/issues/3456
2024-05-15 12:50:35 +02:00
Bnyro 82b6c0d05f [feat] engine: implementation of gitea 2024-05-15 07:23:57 +02:00
Markus Heiser 742303d030 [mod] improve unit converter plugin
- l10n support: parse and format decimal numbers by babel
- ability to add additional units
- improved unit detection (symbols are not unique)
- support for alias units (0,010C to F --> 32,018 °F)

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-05-09 17:16:31 +02:00
Markus Heiser 542f7d0d7b [mod] pylint all files with one profile / drop PYLINT_SEARXNG_DISABLE_OPTION
In the past, some files were tested with the standard profile, others with a
profile in which most of the messages were switched off ... some files were not
checked at all.

- ``PYLINT_SEARXNG_DISABLE_OPTION`` has been abolished
- the distinction ``# lint: pylint`` is no longer necessary
- the pylint tasks have been reduced from three to two

  1. ./searx/engines -> lint engines with additional builtins
  2. ./searx ./searxng_extra ./tests -> lint all other python files

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-03-11 14:55:38 +01:00
Markus Heiser 707d6270c8 [doc] engine: mullvad leta
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-03-10 18:20:07 +01:00
Bnyro f3b4bf86a7 [feat] engine: implementation of void linux packages
Co-authored-by: Markus Heiser <markus.heiser@darmarit.de>
2024-02-29 13:12:40 +01:00
Bnyro e76ab1a4b3 [refactor] images: add resolution, image format and filesize fields
Co-authored-by: Markus Heiser <markus.heiser@darmarit.de>
2024-02-25 16:22:37 +01:00
Bnyro 60d8414ef1 [docs] engine_overview: add packages result template 2024-02-25 14:56:57 +01:00
Alexandre Flament ed66ed758d [mod] reduce memory footprint by not calling babel.Locale.parse at runtime
babel.Locale.parse loads more than 60MB in RAM.  The only purpose is to get:

    LOCALE_NAMES   - searx.data.LOCALES["LOCALE_NAMES"]
    RTL_LOCALES    - searx.data.LOCALES["RTL_LOCALES"]

This commit calls babel.Locale.parse when the translations are update from
weblate and stored in::

    searx/data/locales.json

This file can be build by::

    ./manage data.locales

By store these variables in searx.data when the translations are updated we save
round about 65MB (usually 4 worker = 260MB of RAM saved.

Suggested-by: https://github.com/searxng/searxng/discussions/2633#discussioncomment-8490494
Co-authored-by: Markus Heiser <markus.heiser@darmarit.de>
2024-02-20 10:43:20 +01:00
Markus Heiser c197c0e35e [fix] remove twine from requirements-dev
SearXNG is a rolling release / we do not deploy packages on PyPi

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-02-16 18:29:40 +01:00
Markus Heiser ab8e5383fb [mod] remove X-XSS-Protection headers
Deprecated header not used by browsers nowadays[1]:

"""In modern browsers, X-XSS-Protection has been deprecated in favor of the
Content-Security-Policy to disable the use of inline JavaScript. Its use can
introduce XSS vulnerabilities in otherwise safe websites. This should not be
used unless you need to support older web browsers that don’t yet support CSP.
It is thus recommended to set the header as X-XSS-Protection: 0."""[2]

[1] https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/X-XSS-Protection
[2] https://infosec.mozilla.org/guidelines/web_security#x-xss-protection

Closes: https://github.com/searxng/searxng/issues/3171
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-01-31 17:23:41 +01:00
Markus Heiser e560d7e373 [mod] presearch: add language & region support
In Presearch there are languages for the UI and regions for narrowing down the
search.  With this change the SearXNG engine supports a search by region.  The
details can be found in the documentation of the source code.

To test, you can search terms like::

   !presearch bmw :zh-TW
   !presearch bmw :en-CA

1. You should get results corresponding to the region (Taiwan, Canada)
2. and in the language (Chinese, Englisch).
3. The context in info box content is in the same language.

Exceptions:

1. Region or language is not supported by Presearch or
2. SearXNG user did not selected a region tag, example::

    !presearch bmw :en

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-01-15 19:23:26 +01:00
Markus Heiser f9c5727ddc [mod] get rid of ./utils/brand.env and its workflow
All the environments defined in ./utils/brand.env are generated on the fly, so
there is no longer a need to define the brand environment in this file and all
the workflows to handle this file.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-01-09 16:31:19 +01:00
Bnyro 3dea7e609b [feat] autocompleter: implementation of stract (beta) 2024-01-07 11:18:16 +01:00
Bnyro bf75a8c2a0 [feat] engine: implementation of bpb 2023-11-27 16:46:41 +01:00
Markus Heiser 76b91a3ef6 [dev] manage runtime versions with asdf
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-11-12 20:54:57 +01:00
Alexandre Flament bd3f526859
Docker: add UWSGI_WORKERS and UWSGI_THREAD environment variables (#2992)
* Docker: add UWSGI_WORKERS and UWSGI_THREAD.

UWSGI_WORKERS specifies the number of process.
UWSGI_THREADS specifies the number of threads.

The Docker convention is to specify the whole configuration
through environment variables. While not done in SearXNG, these two
additional variables allows admins to skip uwsgi.ini

In additional, https://github.com/searxng/preview-environments starts Docker
without additional files through searxng-helm-chat.
Each instance consumes 1Go of RAM which is a lot especially when there are a
lot of instances / pull requests.

* [scripts] add environments UWSGI_WORKERS and UWSGI_THREADS

- UWSGI_WORKERS specifies the number of process.
- UWSGI_THREADS specifies the number of threads.

Templates for uwsgi scripts can be tested by::

    UWSGI_WORKERS=8 UWSGI_THREADS=9 \
      ./utils/searxng.sh --cmd\
      eval "echo \"$(cat utils/templates/etc/uwsgi/*/searxng.ini*)\""\
      | grep "workers\|threads"

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>

---------

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Co-authored-by: Markus Heiser <markus.heiser@darmarit.de>
2023-11-12 16:46:34 +00:00
Markus Heiser d13a8f6453 [mod] document server:public_instance & remove it out of the botdetection
- the option server:public_instance lacks some documentation
- the processing of this option belongs in the limiter and not
  in botdetection module

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-11-01 06:44:56 +01:00
Markus Heiser fd814aac86 [mod] isolation of botdetection from the limiter
This patch was inspired by the discussion around PR-2882 [2].  The goals of this
patch are:

1. Convert plugin searx.plugin.limiter to normal code [1]
2. isolation of botdetection from the limiter [2]
3. searx/{tools => botdetection}/config.py and drop searx.tools
4. in URL /config, 'limiter.enabled' is true only if the limiter is really
   enabled (Redis is available).

This patch moves all the code that belongs to botdetection into namespace
searx.botdetection and code that belongs to limiter is placed in namespace
searx.limiter.

Tthe limiter used to be a plugin at some point botdetection was added, it was
not a plugin.  The modularization of these two components was long overdue.
With the clear modularization, the documentation could then also be organized
according to the architecture.

[1] https://github.com/searxng/searxng/pull/2882
[2] https://github.com/searxng/searxng/pull/2882#issuecomment-1741716891

To test:

- check the app works without the limiter, check `/config`
- check the app works with the limiter and with the token, check `/config`
- make docs.live .. and read
  - http://0.0.0.0:8000/admin/searx.limiter.html
  - http://0.0.0.0:8000/src/searx.botdetection.html#botdetection

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-11-01 06:44:56 +01:00
Alex Balgavy 5d53aac20f [mod] add hotkeys option to settings.yml
The change in the hotkey mechanism introduced in 317db5b04 does not allow
configuration via `settings.yml`.  This commit adds that functionality.

Closes: #2898
2023-10-09 18:13:00 +02:00
Aine 213cb74378 [fix] matrixrooms add proper MRS integration
Related:

- https://github.com/searxng/searxng/issues/2918
2023-10-09 13:25:13 +02:00
Bnyro 48cb58bd2e [feat] duckduckgo: support for videos and news 2023-10-09 06:53:43 +02:00
Bnyro ce270961e8 [feat] engine: implementation of mastodon 2023-10-06 10:58:23 +02:00
Markus Heiser fd1422a670 [mod] engine - simplify region & lang handling, make filters configurable
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-10-05 10:55:08 +02:00
Bnyro 26fed56d51 [mod] settings.yml: remove plugin settings for plugins that don't exist anymore 2023-09-29 11:26:49 +02:00
Markus Heiser 597c68b4aa [doc] move dosc of botdetection from developer to admin section
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-09-23 14:00:03 +02:00
Bnyro c999cfb422 [feat] engine: implementation of wallhaven 2023-09-21 14:25:43 +02:00
Bnyro a55e0ac553 [feat] search on category select without JS
Co-authored-by: Alexandre Flament <alex@al-f.net>
2023-09-18 21:29:11 +02:00
jazzzooo 223b3487c3 [fix] spelling 2023-09-18 16:20:27 +02:00
Markus Heiser d9dbcedeb6 [feat] implementation of qwant lite for web search
Related: https://github.com/searxng/searxng/issues/2719
Replace: https://github.com/searxng/searxng/pull/2748
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-09-17 16:53:25 +02:00
Bnyro b4e0d2eedc [feat] engine: implemenation of moviepilot (de) 2023-09-17 14:30:56 +02:00
Bnyro f182abd6f8 [mod] library of congress: fix engine 2023-09-11 19:42:31 +02:00
Hackurei 4da7003ae0 [feat] engine: implementation of odysee 2023-09-02 09:14:12 +02:00
Markus Heiser b0d2cd5ca9 [doc] add documentation of Mwmbl engine & autocompleter
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-08-27 17:25:26 +02:00
Bnyro df71c24b20 [mod] autocomplete.py: add support for mwmbl completions 2023-08-27 17:25:26 +02:00
Markus Heiser 1e2d11fb57 [dev-env] upgrade Sphinx-doc 7.2.3 and unpin at v7.1.2 on py3.8
- Sphinx-doc 7.2.0 drops py3.8 support [1][2]
- last version with py3.8 support is 7.1.2

Many LTS distributions still have py3.8 which EOL is in 2024-10 [3].

To continue to support a development environment on py3.8 the rigid dependency
in the development environment is unpinned in py3.8 / environment markers [4].

To get 7.2.3. work, a fix in sphinx-notfound-page is needed [5][6].

[1] https://github.com/searxng/searxng/pull/2658#issuecomment-1684867270
[2] https://github.com/sphinx-doc/sphinx/issues/11621
[3] https://devguide.python.org/versions/#supported-versions
[4] https://peps.python.org/pep-0508/#environment-markers
[5] https://github.com/readthedocs/sphinx-notfound-page/issues/219
[6] https://github.com/readthedocs/sphinx-notfound-page/issues/219#issuecomment-1694691135

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-08-27 17:22:45 +02:00
Markus Heiser 9100a48541 [mod] improve seekr engines and add documentation
Tis patch adds some more fields to the result items and changed paging to the
``nextResultSet`` given in seekr's JSON response.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-08-15 16:17:42 +02:00
ToxyFlog1627 f175574f37 [fix] typos in documentation & messages 2023-08-13 08:50:29 +02:00
Markus Heiser 905ce2a6f6 [doc] add tagesschau API to the debveloper documentation
supplement to the commit e25d1c728

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-08-11 12:31:03 +02:00
Markus Heiser 01c0ec5d19 [fix] typo in docs/admin/settings/settings_general.rst (doc)
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-08-11 11:07:38 +02:00
Bnyro 224f2250ae [feat] engine: support for lemmy communities, posts, comments and users 2023-08-10 12:58:40 +02:00
Markus Heiser 460bbe5b81 [mod] implement brave (WEB) engine to replace XPath configuration
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-08-08 16:21:45 +02:00
Markus Heiser 64100db904 [doc] improve documentation of make targets and ./manage script
BTW force modularization of the ./mange script into sub modules:

- utils/lib_sxng_data.sh
- utils/lib_sxng_node.sh
- utils/lib_sxng_static.sh
- utils/lib_sxng_test.sh
- utils/lib_sxng_themes.sh

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-08-06 19:37:12 +02:00
Markus Heiser 1d0abb7157 [doc] engine bt4g: add documentation to docs/dev/engines/online/
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-08-06 09:30:48 +02:00
Markus Heiser db522cf76d [mod] engine: wikimedia - improve results, add addition settings & doc
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-08-04 19:06:50 +02:00
Markus Heiser 1b030d4b41 [doc] engine: Yacy
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-08-03 19:58:51 +02:00
Markus Heiser 7aa95d2d52 [doc] engine piped: add documentation to docs/dev/engines/online/
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-08-03 16:23:36 +02:00
Markus Heiser 0623d5ae76 [doc] reduce copyright remark in the footer to the SearXNG team
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-07-29 09:18:14 +02:00
Paolo Basso cada89ee36 [feat] engine: re-enables z-library (zlibrary-global.se)
- re-enables z-library as the new domain zlibrary-global.se is now available
  from the open web.   The announcement of the domain:

    https://www.reddit.com/r/zlibrary/comments/13whe08/mod_note_zlibraryglobalse_domain_is_officially/

  It is an official domain, it requires to log in to the "personal" subdomain
  only to download files, but the search works.

- changes the result template of zlibrary to paper.html, filling the appropriate fields
- implements language filtering for zlibrary
- implement zlibrary custom filters (engine traits)
- refactor and document the zlibrary engine
2023-07-07 21:36:51 +02:00
Markus Heiser 5720844fcd [doc] rearranges Settings & Engines docs for better readability
We have built up detailed documentation of the *settings* and the *engines* over
the past few years.  However, this documentation was still spread over various
chapters and was difficult to navigate in its entirety.

This patch rearranges the Settings & Engines documentation for better
readability.

To review new ordered docs::

   make docs.clean docs.live

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-07-01 22:45:19 +02:00
Markus Heiser e2df6b77a3 [mod] engine: Anna's Archive - additionl settings (content, sort, ext)
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-29 09:32:57 +02:00
Paolo Basso 401561cb58 [mod] engine torznab - refactor & option to hide links
- torznab engine using types and clearer code
- torznab option to hide torrent and magnet links.
- document the torznab engine
- add myself to authors

Closes: https://github.com/searxng/searxng/issues/1124
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-28 10:03:44 +02:00
Markus Heiser e8706fb738 [fix] engine & network issues / documentation and type annotations
This patch fixes some quirks and issues related to the engines and the network.
Each engine has its own network and this network was broken for the following
engines[1]:

- archlinux
- bing
- dailymotion
- duckduckgo
- google
- peertube
- startpage
- wikipedia

Since the files have been touched anyway, the type annotaions of the engine
modules has also been completed so that error messages from the type checker are
no longer reported.

Related and (partial) fixed issue:

- [1] https://github.com/searxng/searxng/issues/762#issuecomment-1605323861
- [2] https://github.com/searxng/searxng/issues/2513
- [3] https://github.com/searxng/searxng/issues/2515

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-25 13:58:26 +02:00
Markus Heiser 825846ed4b [doc] settings.yml: add missing $SEARXNG_REDIS_URL to the docs
Closes: https://github.com/searxng/searxng/issues/2499
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-16 07:49:41 +02:00
Markus Heiser 1f0fb3122d [doc] code and sytle injection is not supported by the simple theme
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-13 11:57:40 +02:00
Markus Heiser f3763d73ad [mod] limiter: blocklist and passlist (ip_lists)
A blocklist and a passlist can be configured in /etc/searxng/limiter.toml::

    [botdetection.ip_lists]
    pass_ip = [
      '51.15.252.168',  # IPv4 of check.searx.space
    ]

    block_ip = [
      '93.184.216.34',  # IPv4 of example.org
    ]

Closes: https://github.com/searxng/searxng/issues/2127
Closes: https://github.com/searxng/searxng/pull/2129
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-06-05 14:07:19 +02:00
Markus Heiser 1ec325adcc [mod] limiter -> botdetection: modularization and documentation
In order to be able to meet the outstanding requirements, the implementation is
modularized and supplemented with documentation.

This patch does not contain functional change, except it fixes issue #2455

----

Aktivate limiter in the settings.yml and simulate a bot request by::

    curl -H 'Accept-Language: de-DE,en-US;q=0.7,en;q=0.3' \
         -H 'Accept: text/html'
         -H 'User-Agent: xyz' \
         -H 'Accept-Encoding: gzip' \
         'http://127.0.0.1:8888/search?q=foo'

In the LOG:

    DEBUG   searx.botdetection.link_token : missing ping for this request: .....

Since ``BURST_MAX_SUSPICIOUS = 2`` you can repeat the query above two time
before you get a "Too Many Requests" response.

Closes: https://github.com/searxng/searxng/issues/2455
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-05-29 14:54:56 +02:00
Markus Heiser 68a0e1bfc3 [doc] fix tyops in docs/dev/reST.rst
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-05-26 11:51:35 +02:00
Markus Heiser 0ee27ba3aa [doc] answer CAPTCHA from server's IP
Related: https://github.com/searxng/searxng/issues/2011#issuecomment-1553317619
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-05-22 10:46:10 +02:00
Markus Heiser c9833ded9f [doc] update & fix documentation of the "SearXNG LXC suite"
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-05-21 18:12:39 +02:00
Markus Heiser 007a615ffa [mod] donation_url: disable by default
SearXNG's donation campaign has been ended.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-05-15 09:19:17 +02:00
spalinger 1918fb2ea0 [fix] typos/grammar in docs 2023-04-21 06:51:44 +02:00
Markus Heiser a5dad3b7c8 [doc] slightly reorder the chapters & improve TOCs for better navigation
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-04-16 15:21:26 +02:00
xrrxr c9d55e3559
[doc] misspelling: weight 2023-04-09 21:21:55 +08:00
Markus Heiser f117f969d8 [mod] in the preference page, show !bang of subgrouping categories
The names of the subgrouping categories in the preference page are translated,
to use this categories the user needs to know by which !bang the category can be
selected.  Related to "Make 'non tab category' bangs discoverable" in [#690].

Related:

- [#690] https://github.com/searxng/searxng/issues/690
- https://github.com/searxng/searxng/issues/1604
- https://github.com/searxng/searxng/pull/1545

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-04-08 11:10:14 +02:00
Markus Heiser 8f79dd7659 [doc] additional descriptions about categories & categories_as_tabs
Add missing documentation of PR [#634].  Related to checkbox "Document how to
categorize engines" in [#690].

Related:

- [#634] https://github.com/searxng/searxng/pull/634#issuecomment-1004757502
- [#690] https://github.com/searxng/searxng/issues/690
- https://github.com/searxng/searxng/issues/1604
- https://github.com/searxng/searxng/pull/1545

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-04-07 12:56:45 +02:00
Markus Heiser 96a2eec3b5 [mod] Archlinux Wiki: improved request API & upgrade to data_type: traits_v1
re-implementation of the Archlinux Wiki:

- fetch_traits(): fetch languages, wiki URLs and title arguments
- add content field to the result list
- add documentation

Wikis from wiki.archlinux.fr, wiki.archlinux.ro, archtr.org/wiki do no longer
exists (has been merged in the main wiki).

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser 057e9bc1d1 [mod] SepiaSearch: re-engineered & upgrade to data_type: traits_v1
- fetch_traits() SepiaSearch and Peertube are using identical languages.
  Replace module's dictionary `supported_languages` by `engine.traits.languages`
  (data_type: `traits_v1`).
- fixed code to pass pylint
- request(): add argument boostLanguages
- response(): is replaced by peertube's video_response() function, which adds
  metadata from channel name, host & tags

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser 8a8c584fec [mod] Dailymotion: improved request API & upgrade to data_type: traits_v1
- fetch_traits(): fetch locales (and languages) from dailymotion API
- removed obsolete data-type "supported_languages"
- add documentation
- improved argument list of the HTTP request:
  - add argument: family_filter_map
  - add conditional argument: localization
    Don't add localization and country arguments if the user does select a
    language (:de, :en, ..)
- improve code quality (mainly improve readability)

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser 2499899554 [mod] Google: reversed engineered & upgrade to data_type: traits_v1
Partial reverse engineering of the Google engines including a improved language
and region handling based on the engine.traits_v1 data.

When ever possible the implementations of the Google engines try to make use of
the async REST APIs.  The get_lang_info() has been generalized to a
get_google_info() function / especially the region handling has been improved by
adding the cr parameter.

searx/data/engine_traits.json
  Add data type "traits_v1" generated by the fetch_traits() functions from:

  - Google (WEB),
  - Google images,
  - Google news,
  - Google scholar and
  - Google videos

  and remove data from obsolete data type "supported_languages".

  A traits.custom type that maps region codes to *supported_domains* is fetched
  from https://www.google.com/supported_domains

searx/autocomplete.py:
  Reversed engineered autocomplete from Google WEB.  Supports Google's languages and
  subdomains.  The old API suggestqueries.google.com/complete has been replaced
  by the async REST API: https://{subdomain}/complete/search?{args}

searx/engines/google.py
  Reverse engineering and extensive testing ..
  - fetch_traits():  Fetch languages & regions from Google properties.
  - always use the async REST API (formally known as 'use_mobile_ui')
  - use *supported_domains* from traits
  - improved the result list by fetching './/div[@data-content-feature]'
    and parsing the type of the various *content features* --> thumbnails are
    added

searx/engines/google_images.py
  Reverse engineering and extensive testing ..
  - fetch_traits():  Fetch languages & regions from Google properties.
  - use *supported_domains* from traits
  - if exists, freshness_date is added to the result
  - issue 1864: result list has been improved a lot (due to the new cr parameter)

searx/engines/google_news.py
  Reverse engineering and extensive testing ..
  - fetch_traits():  Fetch languages & regions from Google properties.
    *supported_domains* is not needed but a ceid list has been added.
  - different region handling compared to Google WEB
  - fixed for various languages & regions (due to the new ceid parameter) /
    avoid CONSENT page
  - Google News do no longer support time range
  - result list has been fixed: XPath of pub_date and pub_origin

searx/engines/google_videos.py
  - fetch_traits():  Fetch languages & regions from Google properties.
  - use *supported_domains* from traits
  - add paging support
  - implement a async request ('asearch': 'arc' & 'async':
    'use_ac:true,_fmt:html')
  - simplified code (thanks to '_fmt:html' request)
  - issue 1359: fixed xpath of video length data

searx/engines/google_scholar.py
  - fetch_traits():  Fetch languages & regions from Google properties.
  - use *supported_domains* from traits
  - request(): include patents & citations
  - response(): fixed CAPTCHA detection (Scholar has its own CATCHA manager)
  - hardening XPath to iterate over results
  - fixed XPath of pub_type (has been change from gs_ct1 to gs_cgt2 class)
  - issue 1769 fixed: new request implementation is no longer incompatible

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser c80e82a855 [mod] DuckDuckGo: reversed engineered & upgrade to data_type: traits_v1
Partial reverse engineering of the DuckDuckGo (DDG) engines including a
improved language and region handling based on the enigne.traits_v1 data.

- DDG Lite
- DDG Instant Answer API
- DDG Images
- DDG Weather

docs/src/searx.engine.duckduckgo.rst:
  Online documentation of the DDG engines (make docs.live)

searx/data/engine_traits.json
  Add data type "traits_v1" generated by the fetch_traits() functions from:

  - "duckduckgo" (WEB),
  - "duckduckgo images" and
  - "duckduckgo weather"

  and remove data from obsolete data type "supported_languages".

searx/autocomplete.py:
  Reversed engineered Autocomplete from DDG.  Supports DDG's languages.

searx/engines/duckduckgo.py:
  - fetch_traits():  Fetch languages & regions from DDG.

  - get_ddg_lang(): Get DDG's language identifier from SearXNG's locale.  DDG
    defines its languages by region codes.  DDG-Lite does not offer a language
    selection to the user, only a region can be selected by the user.

  - Cache ``vqd`` value: The vqd value depends on the query string and is needed
    for the follow up pages or the images loaded by a XMLHttpRequest (DDG
    images).  The ``vqd`` value of a search term is stored for 10min in the
    redis DB.

  - DDG Lite engine: reversed engineered request method with improved Language
    and region support and better ``vqd`` handling.

searx/engines/duckduckgo_definitions.py: DDG Instant Answer API
  The *instant answers* API does not support languages, or at least we could not
  find out how language support should work.  It seems that most of the features
  are based on English terms.

searx/engines/duckduckgo_images.py: DDG Images
  Reversed engineered request method.  Improved language and region handling
  based on cookies and the enigne.traits_v1 data.  Response: add image format to
  the result list

searx/engines/duckduckgo_weather.py: DDG Weather
  Improved language and region handling based on cookies and the
  enigne.traits_v1 data.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser e9afc4f8ce [mod] Startpage: reversed engineered & upgrade to data_type: traits_v1
One reason for the often seen CAPTCHA of the Startpage requests are the
incomplete requests SearXNG sends to startpage.com: this patch is a complete new
implementation of the ``request()`` function, reversed engineered from the
Startpage's search form.  The new implementation:

- use traits of data_type: traits_v1 and drop deprecated data_type: supported_languages
- adds time-range support
- adds save-search support
- fix searxng/searxng/issues 1884
- fix searxng/searxng/issues 1081 --> improvements to avoid CAPTCHA

In preparation for more categories (News, Images, Videos ..) from Startpage, the
variable ``startpage_categ`` was set up.  The default value is ``web`` and other
categories from Startpage are not yet implemented.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser 858aa3e604 [mod] wikipedia & wikidata: upgrade to data_type: traits_v1
BTW this fix an issue in wikipedia: SearXNG's locales zh-TW and zh-HK are now
using language `zh-classical` from wikipedia (and not `zh`).

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser e0a6ca96cc [doc] add a description of bing engines (web, news, video, images)
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser c9cd376186 [mod] replace searx.languages by searx.sxng_locales
With the language and region tags from the EngineTraitsMap the handling of
SearXNG's tags of languages and regions has been normalized and is no longer
a *mystery*.  The "languages" became "locales" that are supported by babel and
by this, the update_engine_traits.py can be simplified a lot.

Other code places can be simplified as well, but these simplifications
should (respectively can) only be done when none of the engines work with the
deprecated EngineTraits.supported_languages interface anymore.

This commit replaces searx.languages by searx.sxng_locales and fix the naming of
some names from "language" to "locale" (e.g. language_codes --> sxng_locales).

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser 61383edb27 [mod] Startpage: fetch engine traits (data_type: supported_languages)
Implements a fetch_traits function for the Startpage engine.

.. note::

   Does not include migration of the request methode from 'supported_languages'
   to 'traits' (EngineTraits) object!

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser a7fe22770a [mod] Peertube: re-engineered & upgrade to data_type: traits_v1
- fetch_traits(): Fetch languages from peertube's search-index source code.

  [mod] Include migration of the request methode from 'supported_languages'
        to 'traits' (EngineTraits) object.
  [fix] old supported_languages_url is no longer valid since the sources
        has been moved to a different path.

- fixed code to pass pylint
- request(): complete re-implementation based on the API docs [1]
- response(): complete re-implementation, adds serveral fields missed before
- add source code documentation

[1] https://docs.joinpeertube.org/api-rest-reference.html#tag/Search/operation/searchVideos

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Markus Heiser 6e5f22e558 [mod] replace engines_languages.json by engines_traits.json
Implementations of the *traits* of the engines.

Engine's traits are fetched from the origin engine and stored in a JSON file in
the *data folder*.  Most often traits are languages and region codes and their
mapping from SearXNG's representation to the representation in the origin search
engine.

To load traits from the persistence::

    searx.enginelib.traits.EngineTraitsMap.from_data()

For new traits new properties can be added to the class::

    searx.enginelib.traits.EngineTraits

.. hint::

   Implementation is downward compatible to the deprecated *supported_languages
   method* from the vintage implementation.

   The vintage code is tagged as *deprecated* an can be removed when all engines
   has been ported to the *traits method*.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-03-24 10:37:42 +01:00
Alexandre Flament d669da81fb
Merge pull request #2027 from dalf/fix_2018
Add "auto" as a language.
2023-02-20 12:17:38 +01:00
Alexandre Flament 6748e8e2d5 Add "Auto-detected" as a language.
When the user choose "Auto-detected", the choice remains on the following queries.
The detected language is displayed.

For example "Auto-detected (en)":
* the next query language is going to be auto detected
* for the current query, the detected language is English.

This replace the autodetect_search_language plugin.
2023-02-17 15:17:36 +00:00
Markus Heiser 5820dc78ce [doc] slight improvements to the doc of the settings (base_url)
Closes: https://github.com/searxng/searxng/issues/2190

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-02-17 12:08:58 +01:00
blob42 27809c84f8 [doc] add example for enabling an engine disabled by default 2023-02-12 18:33:38 +01:00
Markus Heiser 031162be04 [doc] settings.py document search.suspended_times
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-01-29 19:26:16 +00:00
Markus Heiser feccee01c0 [doc] Add doc-strings to searx.exceptions
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2023-01-29 19:06:19 +01:00
Alexandre Flament a9d6f7532a weblate: migration to https://translate.codeberg.org/ 2023-01-21 15:45:12 +00:00
Julian 6e8c7873ee
Correct my small mistake 2023-01-04 20:07:51 +01:00
Julian 2b886ab269
Correct a small mistake 2023-01-04 13:41:41 +01:00
Alexandre Flament 9e9f57e48b
Merge pull request #1954 from dalf/fix.redis.init.2
[fix] follow up of PR-1856
2022-12-14 07:08:19 +01:00
Markus Heiser ed901ab18e [mod] improve 'Autodetect search language' plugin
- Add documentation to the plugin
- Harmonize FastText language model with SearXNG's language model

Reosurces::

    import fasttext                                    # --> +10 MB
    fasttext.load_model(str(data_dir / 'lid.176.ftz')) # --> +4MB

Suggested-by: @dalf

- To speed up and simplify the deployment use fasttext-wheel instead of fasttext
- Building numpy on the Alpine Linux of docker-images takes ages --> install
  py3-numpy from Alpines package manager (apk)
- Alpine Linux on docker-images (musl libc) do not support fasttext-wheel (gnu
  libc) --> patch Dockerfile and build from fastetxt:

     sed -i s/fasttext-wheel/fasttext/ requirements.txt

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-12-11 11:26:07 +01:00
Alexandre Flament 3050e2b6e8 [fix] documentation about update-searxng.rst 2022-12-10 10:06:54 +01:00