Commit Graph

1823 Commits

Author SHA1 Message Date
Zhijie He 76f52b5b45 [feat] add Sogou WeChat article search support 2025-03-02 13:31:31 +01:00
Zhijie He 97aa5a779b [feat] add Sogou engine for searxng
Co-authored-by: Bnyro <bnyro@tutanota.com>
2025-03-02 13:31:31 +01:00
Zhijie He 71d1504e57 [feat] add 360search engine for searxng
Co-authored-by: Bnyro <bnyro@tutanota.com>
2025-03-02 13:25:35 +01:00
Bnyro a51416c7c3 [feat] engines: add openclipart.org 2025-03-01 18:01:51 +01:00
Markus Heiser d0022d86d2 [refactor] soundcloud engine
Closes: https://github.com/searxng/searxng/issues/4226
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-03-01 17:51:14 +01:00
Markus Heiser 1d16b94279 [fix] wikidata: increase wikidata queries timeout
The big queries for initializing and updating the currencies take longer than
the default of the wikidata engine, which is only 3sec.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-03-01 12:21:29 +01:00
Markus Heiser 03419078ef [fix] bing fetch engine traits - adjusted XPath selectors
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-03-01 09:43:06 +01:00
Markus Heiser 887594f634 [fix] Internet links disappeared from wikidata side box (second try)
Related:

- https://github.com/searxng/searxng/pull/4286#issuecomment-2639960013

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-02-26 08:45:56 +01:00
fatwang2 bc5c8e5748 [fix] engine unsplash: image links by preserving URL parameters
Only remove ixid parameter while keeping other essential URL parameters
to ensure images are properly displayed in search results.
2025-02-26 08:44:39 +01:00
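A minimal sketch of the idea behind the unsplash fix above (bc5c8e5748), assuming the image link is an ordinary URL with a query string. The helper name `strip_ixid` and the sample URL are hypothetical and not taken from the engine's source; only the `ixid` parameter name comes from the commit message.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_ixid(url: str) -> str:
    """Return *url* without its ``ixid`` query parameter (hypothetical helper)."""
    parts = urlsplit(url)
    # keep every parameter except ``ixid``, preserving the original order
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "ixid"
    ]
    return urlunsplit(parts._replace(query=urlencode(query)))


# made-up URL, only for illustration
example = "https://images.example.org/photo-1?ixid=abc123&w=400&fit=max"
cleaned = strip_ixid(example)
```

Filtering a single key instead of dropping the whole query string keeps sizing parameters such as `w` and `fit` in this example intact.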
Markus Heiser 2e0abc9310 [fix] various issues in the documentation
Closes: https://github.com/searxng/searxng/issues/4370
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-02-26 07:15:39 +01:00
Markus Heiser 4994fbb5af [fix] engines bing.images & brave.videos - fix parse data string
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-02-20 17:03:25 +01:00
Bnyro 0f2fc5879d [feat] startpage: support for news and images 2025-02-20 13:44:28 +01:00
Markus Heiser feb15e3878 [fix] brave.news engine: response is HTML and no longer JSON
The response from brave.com for news is no longer a JSON string.

Closes: https://github.com/searxng/searxng/issues/4352
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-02-20 10:08:03 +01:00
Markus Heiser 44d941c93c [fix] mojeek web engine: don't add empty fmt argument for web searches
Empty ``&fmt=`` argument triggers an automated tools detection from mojeek.

Suggested-by: @shinodark in https://github.com/searxng/searxng/issues/4307#issuecomment-2669355322
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-02-20 07:45:57 +01:00
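The mojeek fix above (44d941c93c) can be sketched as follows, assuming the request arguments are collected in a dict before they are encoded. The function name, the `q` argument and the `images` value are assumptions for illustration; only the `fmt` argument is named in the commit message.

```python
from urllib.parse import urlencode


def build_query_string(query: str, fmt: str = "") -> str:
    """Build the query string; ``fmt`` is only added when it is non-empty."""
    args = {"q": query}
    if fmt:
        # skip the argument entirely instead of sending an empty ``&fmt=``
        args["fmt"] = fmt
    return urlencode(args)


web_args = build_query_string("searxng")
image_args = build_query_string("searxng", fmt="images")
```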
bonswouar d456f3dd9f [fix] engine adobe stock videos datetime parsing
re #4310
2025-02-12 07:05:58 +01:00
Bnyro 28ead13eb9 [chore] engines: replace datetime.utcfromtimestamp with datetime.fromtimestamp 2025-02-07 17:19:00 +01:00
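For context on the chore above (28ead13eb9): `datetime.utcfromtimestamp` is deprecated since Python 3.12. The sketch below shows the two ways `datetime.fromtimestamp` can be called; the commit title does not say whether the engines pass a `tz` argument, so the second form is shown only for comparison. The timestamp is an arbitrary sample value.

```python
from datetime import datetime, timedelta, timezone

ts = 1_700_000_000  # arbitrary sample timestamp (seconds since the epoch)

# naive datetime in the local timezone of the machine running the code
local_dt = datetime.fromtimestamp(ts)

# timezone-aware datetime in UTC; same wall-clock value that the deprecated
# ``datetime.utcfromtimestamp(ts)`` returned, but with ``tzinfo`` set
utc_dt = datetime.fromtimestamp(ts, tz=timezone.utc)
```

Note that the naive variant depends on the timezone of the host, while the aware variant does not.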
Markus Heiser 147bda894e [fix] Internet links disappeared from wikidata side box
Closes: https://github.com/searxng/searxng/issues/4285

Reported and tested by: Popolon
Suggested-by: @dalf
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-02-06 14:50:30 +01:00
XLion ab1e895cc0 [fix] openverse: update API and website URL (#4275) 2025-02-02 22:12:24 +01:00
Markus Heiser 906b9e7d4c [fix] hostnames plugin: AttributeError: 'NoneType' object has no attribute 'netloc'
Closes: https://github.com/searxng/searxng/issues/4245
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-01-28 16:28:12 +01:00
Markus Heiser 36a1ef1239 [refactor] typification of SearXNG / EngineResults
In [1] and [2] we discussed the need of a Result.results property and how we can
avoid unclear code.  This patch implements a class for the result lists of
engines::

    searx.result_types.EngineResults

A simple example for the usage in engine development::

    from searx.result_types import EngineResults
    ...
    def response(resp) -> EngineResults:
        res = EngineResults()
        ...
        res.add( res.types.Answer(answer="lorem ipsum ..", url="https://example.org") )
        ...
        return res

[1] https://github.com/searxng/searxng/pull/4183#pullrequestreview-257400034
[2] https://github.com/searxng/searxng/pull/4183#issuecomment-2614301580
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-01-28 07:07:08 +01:00
Markus Heiser edfbf1e118 [refactor] typification of SearXNG (initial) / result items (part 1)
Typification of SearXNG
=======================

This patch introduces the typing of the results.  The why and how is described
in the documentation, please generate the documentation ..

    $ make docs.clean docs.live

and read the following articles in the "Developer documentation":

- result types --> http://0.0.0.0:8000/dev/result_types/index.html

The result types are available from the `searx.result_types` module.  The
following have been implemented so far:

- base result type: `searx.result_types.Result`
  --> http://0.0.0.0:8000/dev/result_types/base_result.html

- answer results
  --> http://0.0.0.0:8000/dev/result_types/answer.html

including the type for translations (inspired by #3925).  For all other
types (which still need to be set up in subsequent PRs), template documentation
has been created for the transition period.

Doc of the fields used in Templates
===================================

The template documentation is the basis for the typing and is the first complete
documentation of the results (needed for engine development).  It is the
"working paper" (the plan) with which further typifications can be implemented
in subsequent PRs.

- https://github.com/searxng/searxng/issues/357

Answer Templates
================

With the new (sub) types for `Answer`, the templates for the answers have also
been revised; `Translation` results are now displayed with collapsible entries
(inspired by #3925).

    !en-de dog

Plugins & Answerer
==================

The implementation for `Plugin` and `Answer` has been revised, see
documentation:

- Plugin: http://0.0.0.0:8000/dev/plugins/index.html
- Answerer: http://0.0.0.0:8000/dev/answerers/index.html

With `PluginStorage` and `AnswerStorage` to manage those items (in follow up
PRs, `ArticleStorage`, `InfoStorage` and .. will be implemented)

Autocomplete
============

The autocompletion had a bug where the results from `Answer` had not been shown
in the past.  To test, activate autocompletion and try search terms for which we
have answerers:

- statistics: type `min 1 2 3` .. in the completion list you should find an
  entry like `[de] min(1, 2, 3) = 1`

- random: type `random uuid` .. in the completion list, the first item is a
  random UUID

Extended Types
==============

SearXNG extends e.g. the request and response types of flask and httpx, a module
has been set up for type extensions:

- Extended Types
  --> http://0.0.0.0:8000/dev/extended_types.html

Unit-Tests
==========

The unit tests have been completely revised.  In the previous implementation,
the runtime (the global variables such as `searx.settings`) was not initialized
before each test, so the runtime environment with which a test ran was always
determined by the tests that ran before it.  This was also the reason why we
sometimes had to observe non-deterministic errors in the tests in the past:

- https://github.com/searxng/searxng/issues/2988 is one example for the Runtime
  issues, with non-deterministic behavior ..

- https://github.com/searxng/searxng/pull/3650
- https://github.com/searxng/searxng/pull/3654
- https://github.com/searxng/searxng/pull/3642#issuecomment-2226884469
- https://github.com/searxng/searxng/pull/3746#issuecomment-2300965005
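
The order dependency described in this section can be illustrated with a small sketch. It assumes a module-level dict standing in for a runtime such as `searx.settings`; the names `DEFAULTS`, `settings` and `RuntimeTestCase` are hypothetical and not taken from the SearXNG test suite.

```python
import io
import unittest

# stand-in for a module-level runtime such as ``searx.settings`` (hypothetical)
DEFAULTS = {"debug": False}
settings = dict(DEFAULTS)


class RuntimeTestCase(unittest.TestCase):
    def setUp(self):
        # re-initialize the shared state so no test depends on an earlier one
        settings.clear()
        settings.update(DEFAULTS)

    def test_modifies_runtime(self):
        settings["debug"] = True
        self.assertTrue(settings["debug"])

    def test_sees_defaults(self):
        # passes regardless of execution order because of setUp()
        self.assertFalse(settings["debug"])


suite = unittest.defaultTestLoader.loadTestsFromTestCase(RuntimeTestCase)
result = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0).run(suite)
```

Without the reset in `setUp()`, the second test would fail whenever it runs after the first one, which is the kind of non-deterministic behavior the linked issues describe.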

Why msgspec.Struct
==================

We have already discussed typing based on e.g. `TypedDict` or `dataclass` in the past:

- https://github.com/searxng/searxng/pull/1562/files
- https://gist.github.com/dalf/972eb05e7a9bee161487132a7de244d2
- https://github.com/searxng/searxng/pull/1412/files
- https://github.com/searxng/searxng/pull/1356

In my opinion, `TypedDict` is unsuitable because the objects are still dictionaries
and not instances of classes / the `dataclass` are classes but ...

`msgspec.Struct` types combine the advantages of typing and runtime behaviour,
and also offer (fast) serialization of the objects, including a type check.
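
The runtime difference behind this argument can be shown with the standard library alone. This is only a sketch of the `TypedDict` versus `dataclass` point; `msgspec` itself is not used here, and the class names `AnswerDict` and `AnswerData` are made up for the example.

```python
from dataclasses import dataclass
from typing import TypedDict


class AnswerDict(TypedDict):
    answer: str


@dataclass
class AnswerData:
    answer: str


as_dict = AnswerDict(answer="lorem ipsum")
as_data = AnswerData(answer="lorem ipsum")

# a TypedDict only exists for the type checker; at runtime it is a plain dict
dict_is_plain = type(as_dict) is dict

# a dataclass instance is a real object of its class
data_is_instance = isinstance(as_data, AnswerData)
```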

Currently not possible but conceivable with `msgspec`: outsourcing the engines
into separate processes.  What possibilities this opens up in the future is left
to the imagination!

Internally, we have already defined that it is desirable to decouple the
development of the engines from the development of the SearXNG core / The
serialization of the `Result` objects is a prerequisite for this.

HINT: The threads listed above were the template for this PR, even though the
implementation here is based on msgspec.  They should also be an inspiration for
the following PRs of typification, as the models and implementations can provide
a good direction.

Why just one commit?
====================

I tried to create several (thematically separated) commits, but gave up at some
point ... there are too many things to tackle at once / The comprehensibility of
the commits would not be improved by a thematic separation. On the contrary, we
would have to make multiple changes at the same places and the goal of a change
would be vaguely recognizable in the fog of the commits.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-01-28 07:07:08 +01:00
Bnyro 9079d0cac0 [refactor] translation engines: common interface 2025-01-28 07:07:08 +01:00
Bnyro f766faca3f [feat] engines: add ipernity (images) 2025-01-20 17:22:32 +01:00
Markus Heiser e581921c92 [fix] engine brave: remove date from the content string
Related: https://github.com/searxng/searxng/issues/4211#issuecomment-2601941440
Closes: https://github.com/searxng/searxng/issues/4006

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2025-01-20 16:40:36 +01:00
Bnyro 2f087a3a22 [feat] public domain image archive: automatically obtain algolia api key 2025-01-20 12:56:15 +01:00
Denperidge 3333d9f385 [feat] engines: public domain image archive 2025-01-20 12:56:15 +01:00
Popolon 1a885b70ce [feat] wikidata: add mastodon, peertube and Lemmy accounts to infobox
Co-authored-by: Popolon <popolon@popolon.org>
Co-authored-by: Bnyro <bnyro@tutanota.com>
2025-01-20 11:19:56 +01:00
DanielMowitz 272e39893d [feat]: engines: add astrophysical data system 2025-01-16 20:27:55 +01:00
Lucki 35c80268bf [json_engine] Fix R0912 (too-many-branches) 2025-01-14 14:07:35 +01:00
Lucki 64d954b350 [json_engine] mirror xpath functionality 2025-01-14 14:07:35 +01:00
Lucki 591d9c2505 [json_engine] document existing options 2025-01-14 14:07:35 +01:00
Bnyro 0642c5434a [fix] dockerhub: switch to new api path
Co-authored-by: Markus Heiser <markus.heiser@darmarit.de>
2025-01-06 13:46:13 +01:00
Lucki 18c3e08837 Fix usage of `api_key` engine setting
The value of `params['api_key']` isn't read anywhere.
Writing directly into the header object solves this quite easily though.

> [Users can authenticate by including their API key either in a request URL by appending `?apikey=<API KEY>`, or by including the `X-API-Key: <API KEY>` header with the request.](https://wallhaven.cc/help/api)
2025-01-06 12:25:33 +01:00
Markus Heiser af3f272b0b [fix] update_engine_traits.py: annas archive, bing-* and zlibrary engines
Github action Update data - update_engine_traits [1] had issues in annas
archive, bing-* and zlibrary engines:

    ./manage pyenv.cmd python ./searxng_extra/update/update_engine_traits.py

[1] https://github.com/searxng/searxng/actions/runs/12530827768/job/34953392587

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-12-29 10:12:45 +01:00
Markus Heiser 26097f444b [fix] engine google_video: google changed the layout of the HTML response
Closes: https://github.com/searxng/searxng/issues/4127
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-12-22 11:45:46 +01:00
Markus Heiser 540323a4b0 [mod] hardening xpath engine: ignore empty results
A SearXNG maintainer on Matrix reported a traceback::

    File "searxng-src/searx/engines/xpath.py", line 272, in response
      dom = html.fromstring(resp.text)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "searx-pyenv/lib/python3.11/site-packages/lxml/html/__init__.py", line 850, in fromstring
      doc = document_fromstring(html, parser=parser, base_url=base_url, **kw)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "searx-pyenv/lib/python3.11/site-packages/lxml/html/__init__.py", line 738, in document_fromstring
      raise etree.ParserError(
    lxml.etree.ParserError: Document is empty

I don't have an example to reproduce the issue, but the issue and this patch are
clearly recognizable even without an example.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-11-29 14:20:31 +01:00
Bnyro 0ca2520115 [feat] json/xpath engine: config option for method and body 2024-11-28 09:53:21 +01:00
Markus Heiser 7b6b772e34 [fix] wikicommons engine: remove HTML tags from result items
BTW: humanize filesize (Bytes) to KB, MB, GB ..

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-11-28 06:05:45 +01:00
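A minimal sketch of what "humanize filesize" in the wikicommons fix above (7b6b772e34) could look like. The helper below is hypothetical; in particular the base of 1024 and the two-digit precision are assumptions, not details from the commit.

```python
def humanize_bytes(size: float, precision: int = 2) -> str:
    """Format a byte count with a unit suffix (hypothetical helper)."""
    units = ["Bytes", "KB", "MB", "GB", "TB"]
    idx = 0
    # divide by 1024 until the value fits the current unit
    while size >= 1024 and idx < len(units) - 1:
        size /= 1024
        idx += 1
    return f"{size:.{precision}f} {units[idx]}"


small = humanize_bytes(512)
medium = humanize_bytes(1536)
large = humanize_bytes(5 * 1024**3)
```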
Markus Heiser 342d321196 [fix] google engine: remove <script> tags from result items
In some results, Google returns a <script> tag that must be removed before
extracting the content.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-11-27 13:49:45 +01:00
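The google fix above (342d321196) drops `<script>` content before the text is extracted. The sketch below shows the same idea with the standard-library `html.parser`; this is a swapped-in technique chosen to keep the example self-contained, not the engine's actual code, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class TextWithoutScripts(HTMLParser):
    """Collect text content while skipping everything inside <script>."""

    def __init__(self):
        super().__init__()
        self._in_script = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._in_script += 1

    def handle_endtag(self, tag):
        if tag == "script" and self._in_script:
            self._in_script -= 1

    def handle_data(self, data):
        # text inside a <script> element is ignored
        if not self._in_script:
            self.chunks.append(data)


def text_without_scripts(fragment: str) -> str:
    parser = TextWithoutScripts()
    parser.feed(fragment)
    parser.close()
    # collapse runs of whitespace left behind by removed elements
    return " ".join("".join(parser.chunks).split())


content = text_without_scripts("<div>Result <script>var x = 1;</script>snippet</div>")
```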
Austin-Olacsi 55481a6377 [fix] findthatmeme engine URLs have changed 2024-11-27 11:08:23 +01:00
Markus Heiser 78f5300830 [chore] drop sjp engine: website has changed a long time ago
The web page (PL only) has changed and there is now also a kind of CAPTCHA.
There is currently no possibility to restore the function of this engine.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-11-26 15:45:02 +01:00
Markus Heiser ac0c6cc2d1 [chore] remove invalid base_url from settings.yml engines
The engines do not have / do not need a property `base_url`, let's remove it from
the settings.yml

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-11-26 10:06:07 +01:00
Markus Heiser 36a6f9c95f [fix] engine: Library of Congress - image & thumb links
The properties `item.service_medium` and `item.thumb_gallery` are not given for
every result item.  It is more reliable to use the first (thumb) and
last (image) URL in the list of URLs in `image_url`.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-11-26 09:36:59 +01:00
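The selection rule from the Library of Congress fix above (36a6f9c95f) can be sketched as follows, assuming each result item is a dict whose `image_url` key holds a list of URLs. The function name and the sample URLs are hypothetical.

```python
from typing import Optional, Tuple


def pick_image_links(item: dict) -> Optional[Tuple[str, str]]:
    """Return (thumbnail, image) from the ``image_url`` list, or None."""
    urls = item.get("image_url") or []
    if not urls:
        return None
    # first entry serves as thumbnail, last entry as the full image
    return urls[0], urls[-1]


# made-up item, only for illustration
item = {
    "image_url": [
        "https://example.org/thumb.jpg",
        "https://example.org/medium.jpg",
        "https://example.org/full.jpg",
    ]
}
links = pick_image_links(item)
```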
Bnyro 66f6495a22 [fix] duckduckgo extra: crashes and returns no results 2024-11-25 17:00:52 +01:00
Bnyro f31a3a2053 [chore] *: fix typos detected by typos-cli 2024-11-24 12:41:57 +01:00
Markus Heiser 0253c10b52 [feat] engine: add adobe stock video and audio engines
The engine has been revised; there is now the option ``adobe_content_types``
with which it is possible to configure engines for video and audio from the
adobe stock.  BTW this patch adds documentation to the engine.

To test all three engines in one use a search term like::

    !asi !asv !asa sound

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-11-24 11:56:12 +01:00
Bnyro f20a7632f1 [feat] engine: add adobe stock photos 2024-11-24 11:56:12 +01:00
Markus Heiser 0f9694c90b [clean] Internet Archive Scholar search API no longer exists
Engine was added in #2733 but the API no longer exists. Related:

- https://github.com/searxng/searxng/issues/4038

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-11-23 17:59:38 +01:00
Markus Heiser c4b874e9b0 [fix] engine Library of Congress: fix API URL loc.gov -> www.loc.gov
Avoid HTTP 404 and redirects. Requests to the JSON/YAML API use the base url [1]

    https://www.loc.gov/{endpoint}/?fo=json

[1] https://www.loc.gov/apis/json-and-yaml/requests/

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-11-23 13:02:24 +01:00
Markus Heiser 10d3af84b8 [fix] engine: duckduckgo - don't quote query string
The query string sent to DDG must not be quoted.

The query string was URL-quoted in #4011, but the URL-quoted query string results
in unexpected *URL decoded* and other garbage results, as reported in #4019
and #4020.  To test, compare the results of queries like::

    !ddg Häuser und Straßen :de
    !ddg Häuser und Straßen :all
    !ddg 房屋和街道 :all
    !ddg 房屋和街道 :zh

Closed:

- [#4019] https://github.com/searxng/searxng/issues/4019
- [#4020] https://github.com/searxng/searxng/issues/4020

Related:

- [#4011] https://github.com/searxng/searxng/pull/4011

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-11-17 18:14:22 +01:00
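One plausible reading of the problem fixed above (10d3af84b8), sketched under the assumption that the HTTP client form-encodes the POST data itself (the commit message does not state this explicitly): quoting the query by hand then leads to double encoding, and the receiving side sees percent escapes instead of the original characters.

```python
from urllib.parse import quote_plus, unquote_plus, urlencode

query = "Häuser und Straßen"

# encoded once: what a form-encoding HTTP client does with the raw query
once = urlencode({"q": query})

# encoded twice: the query was quoted by hand and then form-encoded again
twice = urlencode({"q": quote_plus(query)})

# the receiving side decodes only once
seen_once = unquote_plus(once.split("=", 1)[1])
seen_twice = unquote_plus(twice.split("=", 1)[1])
```

In this sketch `seen_once` is the original query, while `seen_twice` still contains the percent escapes.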
Nicolas Dato abd9b271bc [fix] engine: duckduckgo - only uses first word of the search terms
During the revision in PR #3955 the query string was accidentally converted into
a list of words.  Furthermore, the query must be quoted before it is POSTed in
the ``data`` field, see ``urllib.parse.quote_plus`` [1]

[1] https://docs.python.org/3/library/urllib.parse.html#urllib.parse.quote_plus

Closed: #4009
Co-Authored-by: @return42
2024-11-14 09:33:54 +01:00
Bnyro b07c0ae39f [fix] annas archive: crash when no thumbnail, differing results, paging 2024-11-01 12:49:33 +01:00
uply23333 fa108c140f [fix] google: display every result when keyword is contained in content field 2024-10-31 13:21:32 +01:00
Markus Heiser b183e620d8 [refactor] engine: duckduckgo - https://html.duckduckgo.com/html
The entire source code of the duckduckgo engine has been reengineered and
purified.

1. DDG uses the URL https://html.duckduckgo.com/html for no-JS requests; its
   response is also easier to parse than that of the previous
   https://lite.duckduckgo.com/lite/ URL

2. the bot detection of DDG has so far caused problems and often led to a
   CAPTCHA; this can be circumvented by setting the request header
   ``Sec-Fetch-Mode`` to ``navigate``

Closes: https://github.com/searxng/searxng/issues/3927
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-10-29 14:56:27 +01:00
Markus Heiser 050451347b [fix] engine: duckduckgo - CAPTCHA detection
The previous implementation could not distinguish a CAPTCHA response from an
ordinary result list: a CAPTCHA was treated as a result list without any items.

DDG does not block IPs.  Instead, a CAPTCHA wall is placed in front of dubious
requests.

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-10-19 14:55:44 +02:00
dependabot[bot] 2986681b31 [upd] pypi: Bump pylint from 3.2.7 to 3.3.1
Bumps [pylint](https://github.com/pylint-dev/pylint) from 3.2.7 to 3.3.1.
- [Release notes](https://github.com/pylint-dev/pylint/releases)
- [Commits](https://github.com/pylint-dev/pylint/compare/v3.2.7...v3.3.1)

---
updated-dependencies:
- dependency-name: pylint
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-10-15 14:29:10 +02:00
Bnyro 9f48d5f84f [feat] engine: support for openlibrary 2024-10-15 13:06:00 +02:00
0xhtml 8b6a3f3e11 [enh] engine: mojeek - add language support
Improve region and language detection / all locale

Testing has shown the following behaviour for the different
default and empty values of Mojeek's parameters:

| param    | idx | value  | behaviour                 |
| -------- | --- | ------ | ------------------------- |
| region   |  0  | ''     | detect region based on IP |
| region   |  1  | 'none' | all regions               |
| language |  0  | ''     | all languages             |
2024-10-15 06:37:01 +02:00
Snoweuph 5b6f40414a [mod] engine gitea: compatible with modern gitea or forgejo
Without this patch the Gitea Search Engine is only partially compatible with
modern gitea or forgejo:

- Fixing some JSON Fields
- Using Repository Avatar when Available

To verify my results you can look at the modern API doc and results; it's
available on all Gitea and Forgejo instances by default.  Here's a search API
result of mine:

- https://git.euph.dev/api/v1/repos/search?q=ccna
2024-10-14 14:39:11 +02:00
Markus Heiser e7a4d7d7c3 [doc] slightly improve documentation of SQL engines
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-10-03 13:04:06 +02:00
Grant Lanham 2a29e16d25 [feat] implement mariadb engine 2024-10-03 13:04:06 +02:00
Austin-Olacsi cbf1e90979 add get_embeded_stream_url to searx.utils 2024-10-03 07:10:53 +02:00
0xhtml 0a0fb450b5 [enh] engine: stract - add language/region support 2024-09-29 14:29:22 +02:00
Grant Lanham 6a3375be37 [fix] use get accessor to pull desc from bing_images 2024-09-26 07:26:51 +02:00
Zhijie He 6be56aee11 add Cloudflare AI Gateway engine
add settings for Cloudflare AI Gateway engine

set UTF-8 encoding for data; fixes the 500 error caused by non-English characters

format json data

fixed indentation and config format error

fix line-length limitation in CI

reformatted code for CI

limit system prompts to less than 120 chars

cleanup unused variable & format code
2024-09-23 07:02:10 +02:00
Grant Lanham 0b832f19bf [fix] Removes ``/>`` ending tags for void HTML elements
Removes ``/>`` ending tags for void elements [1] and replaces them with ``>``.
Part of the larger effort to clean up invalid HTML throughout the codebase [2].

[1] https://html.spec.whatwg.org/multipage/syntax.html#void-elements
[2] https://github.com/searxng/searxng/issues/3793
2024-09-15 15:19:51 +02:00
Markus d3a795c7e7 [fix] engine: qwant - detect captchaUrl and raise SearxEngineCaptchaException
So far a CAPTCHA was not recognized in the response of the qwant engine and a
SearxEngineAPIException was raised by mistake.  With this patch a CAPTCHA
redirect is recognized and the correct SearxEngineCaptchaException is raised.

Closes: https://github.com/searxng/searxng/issues/3806
Signed-off-by: Markus <markus@venom.fritz.box>
2024-09-15 14:45:23 +02:00
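A rough sketch of the distinction made by the qwant fix above (d3a795c7e7). The two exception classes are local stand-ins for `SearxEngineAPIException` and `SearxEngineCaptchaException`, and the layout of the JSON response (a `status` field and a `captchaUrl` entry under `data`) is an assumption for illustration; only the name `captchaUrl` comes from the commit title.

```python
class EngineAPIError(Exception):
    """Stand-in for SearxEngineAPIException."""


class EngineCaptchaError(Exception):
    """Stand-in for SearxEngineCaptchaException."""


def check_response(data: dict) -> dict:
    """Raise the matching exception for an error response (assumed layout)."""
    if data.get("status") == "success":
        return data
    # assumed layout: a CAPTCHA redirect is announced under data -> captchaUrl
    if (data.get("data") or {}).get("captchaUrl"):
        raise EngineCaptchaError("CAPTCHA")
    raise EngineAPIError(f"API error: {data.get('status')}")


def classify(data: dict) -> str:
    """Small helper to show which branch a response takes."""
    try:
        check_response(data)
    except EngineCaptchaError:
        return "captcha"
    except EngineAPIError:
        return "api-error"
    return "ok"
```

Checking for the CAPTCHA marker before raising the generic API error is what keeps the two cases apart.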
Markus cdb4927b8b [fix] fetch_traits: brave, google, annas_archive & radio_browser
This patch fixes a bug reported by CI "Fetch traits" [1] (brave) and improves
other fetch traits functions (google, annas_archive & radio_browser).

brave:

    File "/home/runner/work/searxng/searxng/searx/engines/brave.py", line 434, in fetch_traits
      sxng_tag = region_tag(babel.Locale.parse(ui_lang, sep='-'))
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "/home/runner/work/searxng/searxng/searx/locales.py", line 155, in region_tag
    Error:     raise ValueError('%s missed a territory')

google:

  change ERROR message about unknown UI language to INFO message

radio_browser:

  country_list contains duplicates that differ only in upper/lower case

annas_archive:

  for better diff; sort the persistence of the traits

[1] https://github.com/searxng/searxng/actions/runs/10606312371/job/29433352518#step:6:41

Signed-off-by: Markus <markus@venom.fritz.box>
2024-09-15 12:48:35 +02:00
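Two of the points from the fetch_traits fix above (cdb4927b8b) can be sketched together: dropping entries that differ only in upper/lower case (radio_browser) and sorting before persisting so that diffs stay small (annas_archive). The function name and the sample list are hypothetical; the actual data structures in the engines may differ.

```python
def unique_case_insensitive(names):
    """Drop entries that differ from an earlier one only in upper/lower case."""
    seen = set()
    result = []
    for name in names:
        key = name.casefold()
        if key not in seen:
            seen.add(key)
            result.append(name)
    return result


countries = unique_case_insensitive(["Germany", "germany", "France", "FRANCE", "Italy"])

# sorting before persisting keeps diffs between two runs small
persisted = sorted(countries)
```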
Bnyro 84e2f9d46a [feat] gitlab: implement dedicated module
Co-authored-by: Markus Heiser <markus.heiser@darmarit.de>
2024-09-15 08:04:21 +02:00
Lucas Schwiderski f05566d925 [fix] json_engine: Fix result fields being mixed up
Fixes #3810.
2024-09-12 10:47:08 +02:00
0xhtml c45870dd71 [fix] yep engine: remove links to other engines
Yep includes links to search for the same query on Google and other
search engines as a result in the search result. This fix skips these
results.
2024-09-12 00:04:04 +02:00
Markus Heiser 9eda4044be [fix] bilibili engine - ValueError in duration & HTML in title
- ValueError in duration: issue reported in #3799
- HTML in title: related to #3770

[#3799] https://github.com/searxng/searxng/issues/3799
[#3770] https://github.com/searxng/searxng/pull/3770

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-09-06 07:13:47 +02:00
Markus 21bfb4996e [fix] engine yahoo: HTML tags are included in result titles
- https://github.com/searxng/searxng/issues/3790

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-09-03 22:26:59 +02:00
Alexander Sulfrian 6a7b1a1a57 [fix] Do not show DDG user-agent from zero click
We do not want to show the user-agent information from the duckduckgo
zero click info. This is the user-agent used by searxng and not the
user-agent used by the user.

This was already done for the IP address in:
0fb3f0e4ae
2024-08-30 09:02:37 +02:00
Austin-Olacsi e45b771ffa [feat] engine: implementation of yandex (web, images)
It's set to inactive in settings.yml because of CAPTCHA.  You need to remove
that setting from the settings.yml to use the engine.

Closes: https://github.com/searxng/searxng/issues/961
2024-08-21 12:08:35 +02:00
Grant Lanham 5276219b9d Fix tineye engine url, datetime parsing, and minor refactor
Changes made to tineye engine:
1. Importing logging if TYPE_CHECKING is enabled
2. Remove unnecessary try-catch around json parsing the response, as this
masked the original error and had no immediate benefit
3. Improve error handling explicitly for status code 422 and 400
upfront, deferring json_parsing only for these status codes and
successful status codes
4. Unit test all new applicable changes to ensure compatibility
2024-08-21 08:41:53 +02:00
0xhtml 0cfed94b08 [fix] engine google: use extract_text everywhere 2024-08-08 09:59:45 +02:00
0xhtml 7f9ce3b96e [fix] engine google: strip bubble text from answers
Google underlines words inside of answers that can be clicked to show
additional definitions. These definitions inside the answer were not
correctly handled and ended up in the middle of the answer text. With
this fix, the extra definitions are stripped from the answer shown by
the frontend.
2024-08-08 09:59:45 +02:00
Markus Heiser edfd0e2fe5 [fix] brave fetch_traits: Brave added Chinese (zh-hant) to UI
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-07-29 10:28:53 +02:00
Markus Heiser ee959ed9fc [fix] engine geizhals: if there are no offers, there is no best price
Fault pattern: if there are no offers, an exception was thrown:

    IndexError: list index out of range

This patch makes the addition of “best price” dependent on whether one exists.

Closes: #3685
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-07-28 19:00:51 +02:00
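The guard described in the geizhals fix above (ee959ed9fc) can be sketched as follows. The function and the keys `title` and `best_price` are hypothetical, and the sketch assumes the offers are already sorted by price.

```python
def build_result(title: str, offers: list) -> dict:
    """Build a result item; the best price is optional (hypothetical helper)."""
    result = {"title": title}
    # only add the best price when at least one offer exists;
    # ``offers[0]`` on an empty list would raise IndexError
    if offers:
        result["best_price"] = offers[0]  # assumes offers are sorted by price
    return result


with_offers = build_result("widget", ["9.99", "12.50"])
without_offers = build_result("widget", [])
```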
Bnyro 304ddd8114 [feat] videos template: support for view count 2024-07-27 11:49:58 +02:00
Bnyro 84abab0808 [feat] engine: implementation of geizhals.de 2024-07-27 11:46:25 +02:00
Sylvain Cau b9ddd59c5b [enh] Add API Key support for discourse.org forums 2024-07-27 09:21:40 +02:00
Markus Heiser 657dcb973a [fix] engine yacy: update list of base URLs
https://search.lomig.me
  Poor results / tested `!yacy :en hello` and got zero results

https://yacy.ecosys.eu
  Slow response (> 6sec for trivial search terms)

https://search.webproject.link
  Dead instance / URL offline

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-07-20 09:59:43 +02:00
Grant Lanham 9a4fa7cc4f Update mullvad_leta.py to account for img_elem
A recent update from Mullvad Leta introduced the img_elem. This update
broke the existing logic. Now, by checking the length of the dom_result
to see if it was included in the return results, we can handle the logic
accordingly.
2024-07-15 06:58:39 +02:00
Bnyro e4da22ee51 [feat] engine: implementation of alpine linux packages
Co-authored-by: Markus Heiser <markus.heiser@darmarit.de>
2024-07-14 17:57:58 +02:00
Grant Lanham ef103ba80a Implement google/brave switch in Mullvad Leta
cleanup

Import annotations
2024-07-07 08:08:11 +02:00
Bnyro 4eaa0dd275 [fix] gentoo: use mediawiki engine 2024-07-03 10:24:03 +02:00
Thomas Renard 39aaac40d6 [mod] libretranslate: add direct link to translation (engine) 2024-06-30 16:18:33 +02:00
Markus Heiser 0f9926b89a [fix] brave fetch_traits: layout of the settings page has changed
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-06-25 15:08:18 +02:00
Markus Heiser 39ffec87b7 [fix] engine zlibrary: handle seized domain
The domains of zlibrary instances are known to be seized from time to time.
This leads to problems when, for example, the automated tasks try to update the
engine traits (aka fetch_traits). The search function should also generate a
suitable error message (currently either SSL errors or empty result lists are
returned). [1]

[1] https://github.com/searxng/searxng/issues/3610
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-06-25 14:40:19 +02:00
Markus Heiser b8fa4d6195 [fix] bing news results return invalid images
Closes: https://github.com/searxng/searxng/issues/3502
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2024-06-25 11:12:41 +02:00
Grant Lanham 9a9ca307fe [fix] implement tests and remove usage of gen_useragent in engines 2024-06-23 11:51:41 +02:00
Richard Lyons f195d98bfb Fix search_url building. 2024-06-20 06:30:00 +02:00
Allen 13eec44b65 [fix] !goi irrelevant results AND display more results 2024-06-16 16:45:03 +02:00
Bnyro e9f8412a6e [perf] torrents.html, files.html: don't parse and re-format filesize 2024-06-15 15:42:29 +02:00
Bnyro df15c21b35 [feat] mozhi: fix crash, support synonyms and definition 2024-06-15 11:33:09 +02:00
Bnyro 1fe13d0ba4 [refactor] duckduckgo: use extr helper function in get_vqd 2024-06-15 11:24:05 +02:00
Bnyro 46c5309888 [feat] mojeek: implement dedicated module 2024-06-07 11:31:05 +02:00
allendema_searxng_pi ee146dbc07 [enh] Add engine for discourse forums 2024-06-07 10:16:09 +02:00