Logo-amall

The docs currently say you can only add relationships and permissions to the storage/auth schemas, but what about indexes? It seems like these would be necessary in a lot of cases depending on queries that need to be run (or permissions that are set on the tables)

Last active 4 months ago

86 replies

13 views

  • SH

    The docs currently say you can only add relationships and permissions to the storage/auth schemas, but what about indexes? It seems like these would be necessary in a lot of cases depending on queries that need to be run (or permissions that are set on the tables)

  • EL

    I think that would be OK. What do you think @david.barroso?

  • EL

    But for now, you would have to do that directly on your production database, and not via migrations I think. Because the database user executing migrations does not have permission to modify the storage and auth schemas.

  • DA

    yeah, I think that makes sense. The constraint relates more to things that could break the services (some people attempted in the past to alter columns and even remove them causing auth and storage to break)

  • SH

    do you know how permissions work in terms of indexes? We have a weird situation where expanding the files table in the graphql request causes queries to go from 0.1s to 5+ seconds, at least locally. Is there a difference in the cli that would cause slower queries like this or some type of performance hit when querying across schemas? We aren’t sure yet if it’s also in a cloud project

  • SH

    But it seemed like maybe the files table needs more indexing in this case…will have to think about how we can manage it though if we aren’t able to use migrations

  • EL

    Do you have an example of such query?

  • EL

    0.1s to 5s+ sounds odd

  • DA

    yeah, if you could share more details that'd be appreciated. For instance, what does "expanding the files table in the graphql request" mean exactly?

  • SH

    i mean as an example we have a table like items that has a file_id column, the difference between these two queries ```query {
    items {
    file_id
    }
    }

    query {
    items {
    file {
    id
    name
    }
    }
    }```

  • SH

    the latter is vastly slower for reasons we don't understand

  • SH

    i do have a lot of permissions on the file table per our previous discussions so that's why i'm wondering if the files table needs indexing, but maybe there's something else to look into

  • EL

    from items to file Hasura should do a join based on the file_id which is already indexed (PK) so I wouldn't expect that to be slow.

  • EL

    Can you analyze the query and see the corresponding SQL?

  • EL
  • SH

    yeah that's why i was asking about the permissions

  • SH

    i agree the join is already indexed

  • SH

    but how do hasura permissions work?

  • SH

    if they are like queries at all then they could be extremely non-performant

  • SH

    we just verified that removing the permissions makes the query fast again

  • SH

    (running the query as admin is fine, it's when running as a user)

  • EL

    Then it's probably due to permissions as you say

  • EL

    For every query, Hasura has to also lookup permissions via the database. If you have complex permissions, that could lead to complex (and slow) SQL execution for permissions

  • SH

    but can the sql for permissions be alleviated with an index?

  • EL

    Yes, but it also depends on the permissions you have

  • SH

    and if so how would i do it since i can't do it with migrations? do i have to use psql directly?

  • SH

    e.g. in my case i bet files having a bucket_id index would be very useful

  • SH

    since i am doing a check for bucket_id --> _in:

  • SH

    the indexing that would help this query are all on the files table unfortunately

  • EL

    ah, thats probably it then. That's going to be a linear scan probably

  • EL

    Can you manually add an index in bucket_id and see if that speeds things up?

  • SH

    how do i manually add? psql?

  • EL

    yes

  • EL

    Do you do these tests locally?

  • SH

    yeah i'm currently testing locally

  • SH

    i jsut need to find example sql for creating the index

  • EL
    psql postgres://postgres:postgres@localhost:5432/postgres -c 'CREATE INDEX idx_storage_files_bucket_id ON storage.files(bucket_id)'
    
  • SH

    thank you

  • SH

    do you know if an index takes effect right away or if we would have to restart nhost? that didn't seem to help but i will keep testing manually to see where the bottleneck is

  • SH

    thanks for the help

  • EL

    I think it takes effect right away

  • EL

    If you can share your permissions, here or in DM, maybe I can try to help

  • SH

    just figured it out

  • SH

    it's the JIT issue again (just-in-time compilation that postgres has on by default)

  • SH

    it really really does not play well with hasura permissions

  • SH

    https://discord.com/channels/552499021260914688/1037805225030987846/1037805225030987846

  • SH

    i had written a script as part of my production migrations to set JIT off on sql but it's not in the regular hasura migrations so it doesn't get applied to local cli

  • SH

    7 second query went to 0.19s with JIT turned off

  • SH

    lots of other hasura users with the same problem that i mentioned in that thread: https://github.com/hasura/graphql-engine/issues/3672

  • SH

    david mentioned you may be able to add a setting for that at some point -- would be useful probably for anyone with reasonably complex permissions since the performance issue is really severe. no idea why it does that but somehow the query plan becomes completely crazy

  • DA

    maybe we should investigate turning the JIT off by default

  • EL

    Interesting. What's the downside of turning off JIT?

  • DA

    in theory it could make some queries slower, in practice the JIT seems to be being a bit counterproductive. My guess is that it may have to do with how dynamic queries turn out to be due to permissions but it is a shame hasura folks didn't seem to investigate the issue (or at least didn't comment on it)

  • DA

    but might be worth measuring the impact on simpler cases compared to sheena's case and decide which default value might be better. Allowing to turn this off/on consistently in the cli/cloud would be a nice thing to do too

  • DA

    probably something to think about once we are done with the config project

  • SH

    yeah would be interesting to see if there are any cases where JIT is actually helpful with hasura; it does seem like it's counterproductive for most people

  • SH

    it's interesting bc hasura without JIT can handle these really complex permissions with pretty great response times

  • SH

    i notice even when i simplify permissions JIT will still make the queries 2x as slow, though when the permissions are very simple it's barely noticeable

  • SH

    like it will be 0.2s vs 0.4s

  • SH

    once the permissions start getting a little more complex it balloons into 10 and 20 second queries

  • SH

    @david.barrosowhat's the best way to automatically apply this and have it persist on my postgres when running the cli locally? it seems like just running things in the hasura console doesn't persist. i tried adding it as a seed though that's not really what seeds are for but it also doesn't persist

  • EL

    @sheena do you mean the bucket_id index?

  • SH

    like is there something in the config.yaml that i could use to apply it so that it always applies to my local cli (and for all engineers on my team for every new nhost up)

  • SH

    it seems like possibly otherwise it has to be set every time the postgres container starts? or i'm otherwise unsure. i can set it with a psql script or in the console but then the setting is lost every time

  • EL

    I'm pretty sure there is no official way of handling this at the moment. We need to come up with one.

  • SH

    but there are some overrides on the docker contianer, right?

  • SH

    https://github.com/hasura/graphql-engine/issues/3672#issuecomment-580018608

  • EL

    adding @Nuno Pato here too, so he's aware of this conversation

  • SH

    bc people seem to do it in the docker-compose like command: postgres -c jit=off

  • SH

    i know the config.yaml allows some overrides of things in the docker-compose but i'm just not sure which ones

  • EL

    @lipchyk Correct me if I'm wrong, but we don't have support for such command in our config/CLI for specific services?

  • DA

    no, we don't. And I don't think we should add it either. It is just going to pollute the config file and enable people to do things they won't be able to do in the cloud anyway

  • DA

    @sheena how are you disabling the jit in the cloud right now?

  • SH

    i ran this as a one-time sql script at the same time of running all my other migrations from my previous heroku graphql instance (and the permissions sql script you helped me with)

  • SH

    ```ALTER DATABASE :DBNAME SET jit TO 'off';
    SHOW jit;````

  • SH

    if i do this locally though it is lost every time you restart

  • SH

    (and even in the cloud it's not ideal; i assume it's possible that the postgres instance will lose the setting again at some point, it's just less frequent than local cli….but for us if it happens randomly on the cloud it will make some queries completely unusable)

  • LI

    we don't

  • SH

    ok, thanks. is there some other way then to hook into the postgres start and apply my own settings if there aren't any short-term plans to add JIT control or other ways to control postgres settings (i.e. maybe some way to edit postgresql.conf)? the issue with JIT and hasura permissions is unfortunately so severe for us that it makes our app really difficult to develop even locally with only a handful of objects in the DB, so we need some way to make sure it's always applied even if we stop and start the dev environment. (that example i gave above with <=0.1s to 5-7+s was a query returning only 5 objects)

  • LI

    /cc @Nuno Pato

  • NU

    @sheena the setting should be kept across restarts if you are not deleting the .nhost folder

  • NU

    so you shouldn't have to disable it on every restart

  • NU

    we do have plans for more fine grained controls to the postgres instance but it is not a top priority atm - it will require some work to investigate and implement everywhere. I will think if there is any quick solution for your specific problem if for some reason that config is not being kept (which is weird)

  • SH

    ok, i'll try to debug what's happening more

  • SH

    and i put in a vote for prioritizing at least the JIT setting! i understand that the postgres settings aren't that high priority for you at the moment, but i think the JIT setting is very hasura-specific and affects enough hasura users that a hasura-dependent project like nhost should consider prioritizing that setting and maybe at least run some tests like david said. we're lucky that nhost recently switched to allowing separate postgres connection strings so we can at least run a psql command, or it would be a complete dealbreaker and we wouldn't be able to use nhost at all

  • DA

    what I’d suggest you to do in the meantime is to write some wrapper shell script to start nhost locally and then run the psql script to turn the jit off. I can help with the shell script if you want. Should be just a few lines

Last active 4 months ago

86 replies

13 views